The collaborators on this project and their contributions:
- ndudhel (Neel Dudheliya) worked on the chunk-based approach.
- tgandhi (Tanay Gandhi) worked on the clustering and naive approaches.
- pjibhak (Pranav Jibhakate) worked on the distributed approach.
Each branch implements a different approach, so we were not able to merge them into one. The code in these branches was used for our demo and test results. Every branch has its own owner (collaborator), setup guide, and details in its own Readme.md. The approaches and their branches are:
- Naive approach: `dev`
- Chunk-based approach: `parallel-search`
- Clustering approach (K-Means): `clustering`
- Distributed approach: `distributed-with-docker`

macOS (Homebrew):

    brew install go@1.22 gcc musl pkg-config sqlite curl zlib docker docker-compose
    brew link go@1.22 --overwrite --force

Linux (Debian/Ubuntu, using apt-get):

    sudo apt-get update
    sudo apt-get install build-essential musl-tools pkg-config libsqlite3-dev zlib1g-dev libcurl4-openssl-dev docker-ce curl
    sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    sudo chmod +x /usr/local/bin/docker-compose
    wget https://go.dev/dl/go1.22.0.linux-amd64.tar.gz
    sudo tar -C /usr/local -xvzf go1.22.0.linux-amd64.tar.gz
    export PATH=$PATH:/usr/local/go/bin

To test any of the approaches you will need datasets. We used datasets from Loghub and Zenodo. The datasets used in Results/Evaluation are:

| Dataset | Compressed/Download Size | Uncompressed/Original Size |
|---|---|---|
| Mac | 1.5 MB | 16.9 MB |
| OpenSSH | 4.6 MB | 70.02 MB |
| Android_v1 | 24.9 MB | 183.37 MB |
| HDFS_v1 | 186.6 MB | 1.47 GB |
| Hive | 128.5 MB | 2.13 GB |
| Spark | 183.5 MB | 2.75 GB |
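
Before downloading the datasets, it may help to confirm that the tools installed by the setup commands are actually on your `PATH`. The following is a minimal sketch and not part of the original setup steps: `check_tool` is a hypothetical helper name, and the tool list is an assumption that mirrors the packages named in the install commands.

```shell
#!/bin/sh
# Sketch only: report whether each required tool can be found on PATH.
# check_tool is a hypothetical helper, not part of this repository.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed tool list, taken from the install commands in this README.
for tool in go gcc pkg-config curl docker docker-compose; do
    check_tool "$tool" || true
done
```

Any tool reported as `missing` most likely needs the matching install step rerun, or, for Go on Linux, the `export PATH` line added to your shell profile so that it persists across sessions.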