apache/datasketches

DataSketches is now Apache DataSketches.

DataSketches is an open source, high-performance library of stochastic streaming algorithms commonly called "sketches" in the data sciences. Sketches are small, stateful programs that process massive data as a stream and can provide approximate answers, with mathematical guarantees, to computationally difficult queries orders-of-magnitude faster than traditional, exact methods.
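To make the idea of a sketch concrete, here is a minimal, illustrative K-Minimum-Values (KMV) sketch for approximate distinct counting, written in plain Python. It is not the DataSketches implementation or API: the class name `KMVSketch`, its methods, and the default `k` are assumptions chosen for this example. It only shows the general pattern of a small, stateful object that sees each stream item once and answers a query approximately.

```python
import hashlib
import heapq


class KMVSketch:
    """Toy K-Minimum-Values sketch (hypothetical; not the DataSketches API)."""

    HASH_SPACE = 2 ** 64  # hashes are treated as integers in [0, 2**64)

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (negated values) of the k smallest hashes
        self._members = set()  # same hashes, for O(1) duplicate checks

    @staticmethod
    def _hash(item):
        # Derive a 64-bit integer hash from the item's string form.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def update(self, item):
        """Process one stream item; state never grows beyond k hashes."""
        h = self._hash(item)
        if h in self._members:
            return  # already retained, so a repeat changes nothing
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest retained one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        """Approximate number of distinct items seen so far."""
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct: exact count
        kth_smallest = -self._heap[0] / self.HASH_SPACE  # scaled into [0, 1)
        return (self.k - 1) / kth_smallest


# Usage: one million updates, only 100,000 distinct values.
sketch = KMVSketch(k=1024)
for i in range(1_000_000):
    sketch.update(i % 100_000)
approx_distinct = sketch.estimate()
```

The sketch keeps at most `k` hash values however long the stream is, and the relative error of this kind of estimator shrinks roughly in proportion to `1/sqrt(k)`. That trade of a small, bounded error for bounded memory and a single pass over the data is what the paragraph above describes.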

In 2019, after 8 years of development and 5 years as an open source project, we began the important migration from a stand-alone GitHub site to membership in the Apache Software Foundation (ASF) community. In December 2020, we became an official Top-Level Project within the ASF.

After years of development and community building, we now have parallel core libraries in Java, C++, Python, and Go that implement many of the same sketch algorithms.

Please visit the main DataSketches website for more information.

For issues or questions, please see our Community page.

If you are looking for one of our old repository sites, please refer to this transition page.
