    Repositories list

    • ColBERT

      Public
      ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
      Python
      MIT License
      3893.1k7718 Updated Nov 18, 2024
    • ARES

      Public
      Automated Evaluation of RAG Systems
      Python
      Apache License 2.0
      53487102 Updated Nov 4, 2024
    • FrugalGPT

      Public
      FrugalGPT: better quality and lower cost for LLM applications
      Jupyter Notebook
      Apache License 2.0
      2118730 Updated Sep 19, 2024
    • stk

      Public
      Python
      Apache License 2.0
      209021 Updated Aug 26, 2024
    • gavel

      Public
      Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
      Jupyter Notebook
      MIT License
      3112582 Updated Jul 25, 2024
    • InQuest

      Public
      Accelerating Aggregation Queries on Unstructured Streams of Data
      Python
      2710 Updated Apr 18, 2024
    • Ongoing research training transformer models at scale
      Python
      Other
      2.4k3302 Updated Jan 19, 2024
    • tasti

      Public
      Semantic Indexes for Machine Learning-based Queries over Unstructured Data (SIGMOD 2022)
      Python
      51500 Updated Jan 17, 2024
    • omg

      Public
      Python
      Apache License 2.0
      32000 Updated Sep 20, 2023
    • abae

      Public
      Accelerating Approximate Aggregation Queries with Expensive Predicates (VLDB 21)
      Python
      1300 Updated Sep 20, 2023
    • FAST

      Public
      End-to-end earthquake detection pipeline via efficient time series similarity search
      Jupyter Notebook
      Apache License 2.0
      56145121 Updated Jul 6, 2023
    • Willump

      Public
      Willump Is a Low-Latency Useful Machine learning Platform.
      Python
      MIT License
      84301 Updated Mar 24, 2023
    • Code for POP (SOSP 2021) and NCFlow (NSDI 2021)
      Jupyter Notebook
      12000 Updated Mar 7, 2023
    • macrobase

      Public
      MacroBase: A Search Engine for Fast Data
      Java
      Apache License 2.0
      1266612012 Updated Dec 14, 2022
    • cs245-as1

      Public
      Student files for CS245 Programming Assignment 1: In-memory data layout
      Java
      Apache License 2.0
      501202 Updated Nov 16, 2022
    • Algorithms for compressing and merging large collections of sketches
      Jupyter Notebook
      Apache License 2.0
      2503 Updated Nov 16, 2022
    • Model Performance Estimation and Explanation When Labels and a Few Features Shift
      Python
      0800 Updated Nov 7, 2022
    • loa

      Public
      Public code for LOA
      Python
      Apache License 2.0
      21800 Updated Oct 5, 2022
    • Java
      2400 Updated Jun 1, 2022
    • ezmode

      Public
      An iterative algorithm for selecting rare events in large, unlabeled datasets
      Python
      0100 Updated May 25, 2022
    • teavar

      Public
      Julia
      10000 Updated May 3, 2022
    • smol

      Public
      C++
      Apache License 2.0
      3500 Updated Apr 3, 2022
    • Uniserve

      Public
      A runtime implementation of data-parallel actors.
      Java
      MIT License
      63820 Updated Apr 1, 2022
    • Scala
      29900 Updated Jan 26, 2022
    • Baleen

      Public
      Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)
      Python
      MIT License
      54300 Updated Dec 27, 2021
    • POP

      Public
      Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021
      Python
      MIT License
      42400 Updated Dec 15, 2021
    • supg

      Public
      Python
      Apache License 2.0
      4510 Updated Jul 29, 2021
    • Sinkhorn Label Allocation is a label assignment method for semi-supervised self-training algorithms. The SLA algorithm is described in full in this ICML 2021 paper: https://arxiv.org/abs/2102.08622.
      Python
      MIT License
      35300 Updated Jun 15, 2021
    • blazeit

      Public
      It's BlazeIt because it's blazing fast
      C++
      103260 Updated Jun 12, 2021
    • Simple benchmark for Redis geosets for top-k queries.
      Rust
      0000 Updated May 28, 2021