chetankar65/dfs

Distributed File system (DFS)

The different modules:

  • Storage server(s): Physical devices on which files are stored.
  • Naming server: The main server that clients interact with. It holds a global directory tree that maps every file and its metadata to the storage server the file lives on, and it provides a unified interface to clients.
  • Client: Clients are responsible for creating, updating and reading files. They store these files on storage servers by interacting through the naming server.

High level system design:

[Image: high-level system design diagram]

Features implemented:

  1. Naming servers, Storage servers and client configuration.
  2. Global directory tree as the single source of truth for global paths.
  3. Utility functions for hashing, JSON parsing, and round-robin selection of storage servers
  4. Metadata for every file (file size)
  5. Prevention of duplicate uploads
  6. Locking mechanisms
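The round-robin selection mentioned in item 3 could look roughly like the sketch below. The class name `RoundRobinSelector`, the use of plain strings as server identifiers, and the atomic counter are all assumptions made for illustration; the repository's own utility may be structured differently.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin picker; not taken from the repository. */
class RoundRobinSelector {
    private final List<String> servers;            // assumed: server ids as strings
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no storage servers registered");
        }
        this.servers = List.copyOf(servers);       // immutable snapshot
    }

    /** Returns the next server in rotation; safe to call from several threads. */
    String next() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector(List.of("s1", "s2"));
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.next());
        }
    }
}
```

An atomic counter avoids a lock on the selection path, which matters if the naming server handles several client requests at once.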

Features I'm too lazy to implement because they are too tedious:

  1. Deadlock resolution
  2. Replication

Flow:

  1. Each storage server registers with a central "Naming server" and sends its directory structure (current directory only) to the naming server. If the path on the storage server is /doc/a.txt, then the path of the file in the global tree will be /root/doc/a.txt. The directory tree in the naming server is thus the single source of truth.
  2. A client asks the naming server for space to upload a file.
  3. The naming server selects a storage server and sends that server's details to the client.
  4. The client then uploads the file to that storage server, and the global tree is updated.
  5. Clients can list the files in the global directory tree and can read, update, and delete a specified file.
  6. Simple locking mechanisms prevent problems during concurrent writes to the same file.
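The locking in step 6 might be sketched as a per-path lock table, as below. This is only one plausible shape for it: the class name `PathLockTable`, the idea of identifying lock holders by a client id string, and the method names are assumptions, not the repository's API. As in the feature list above, there is no deadlock handling here.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-path write lock table; not taken from the repository. */
class PathLockTable {
    // Maps a global path to the id of the client currently holding its lock.
    private final ConcurrentHashMap<String, String> owners = new ConcurrentHashMap<>();

    /** Returns true if clientId now holds (or already held) the lock on path. */
    boolean tryLock(String path, String clientId) {
        String current = owners.putIfAbsent(path, clientId);
        return current == null || current.equals(clientId);
    }

    /** Releases the lock only if clientId is the current holder. */
    boolean unlock(String path, String clientId) {
        return owners.remove(path, clientId);
    }

    public static void main(String[] args) {
        PathLockTable locks = new PathLockTable();
        String path = "/root/doc/a.txt";
        System.out.println(locks.tryLock(path, "clientA"));
        System.out.println(locks.tryLock(path, "clientB"));
        locks.unlock(path, "clientA");
        System.out.println(locks.tryLock(path, "clientB"));
    }
}
```

A writer that fails `tryLock` would have to retry or report an error to the user; which of these the project does is not stated here.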

The distributed file system follows a tree structure, similar to local filesystems in operating systems:

[Image: directory tree diagram]
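To make the tree structure concrete, here is a minimal sketch of a global directory tree that maps a storage-server path such as /doc/a.txt to /root/doc/a.txt and records which server holds each file. The class `GlobalTree`, its `Node` fields, and the methods `addFile` and `locate` are hypothetical names chosen for this example; the real naming server may store more metadata and use a different layout.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical global directory tree; not taken from the repository. */
class GlobalTree {
    static final class Node {
        final Map<String, Node> children = new TreeMap<>();
        String serverId;   // non-null only for file nodes
        long size;         // file size in bytes (the metadata mentioned above)
    }

    private final Node root = new Node();   // represents /root

    /** Prefixes a storage-server path with /root, e.g. /doc/a.txt becomes /root/doc/a.txt. */
    static String toGlobalPath(String storagePath) {
        return "/root" + (storagePath.startsWith("/") ? storagePath : "/" + storagePath);
    }

    /** Adds a file; returns false for duplicates or paths that clash with existing entries. */
    boolean addFile(String storagePath, String serverId, long size) {
        String[] parts = toGlobalPath(storagePath).split("/");
        Node node = root;
        for (int i = 2; i < parts.length; i++) {      // parts[0] is "", parts[1] is "root"
            if (parts[i].isEmpty()) continue;
            if (node.serverId != null) return false;  // cannot descend into a file
            node = node.children.computeIfAbsent(parts[i], k -> new Node());
        }
        if (node == root || node.serverId != null || !node.children.isEmpty()) {
            return false;                              // root, duplicate file, or a directory
        }
        node.serverId = serverId;
        node.size = size;
        return true;
    }

    /** Returns the id of the server holding the file, or null if there is no such file. */
    String locate(String globalPath) {
        String[] parts = globalPath.split("/");
        if (parts.length < 2 || !parts[1].equals("root")) return null;
        Node node = root;
        for (int i = 2; i < parts.length && node != null; i++) {
            if (parts[i].isEmpty()) continue;
            node = node.children.get(parts[i]);
        }
        return node == null ? null : node.serverId;
    }
}
```

Rejecting a second `addFile` for the same path is one simple way to get the duplicate-upload prevention listed among the features, although the project may instead compare content hashes.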

Requirements

  • Java 11 or newer (for java.net.http API)

Compile

javac NamingServer.java StorageServer.java Client.java

Run naming server (in one terminal)

java NamingServer

Run Storage Server (in another terminal) - Can be more than one

java StorageServer

Run client (in another terminal) - Can be more than one

java Client

References:

CMU Distributed File system: https://www.andrew.cmu.edu/course/14-736-s20/applications/labs/proj3/proj3.pdf?
