Big Data/Google File System



The Wikipedia article on the Google File System gives an overview of the following paper from Google, titled The Google File System.

Problem
Need a storage server that supports
 * many hundreds of terabytes of data,
 * distributed across many thousands of cheap commodity servers,
 * serving many hundreds of clients.

Tasks: Many of these workloads can be built on MapReduce.
 * Build and store Google search index
 * Store and analyze Log Files of Production servers and Google Analytics
 * YouTube (!)
 * Research computations

Existing Solutions
RAID Redundant Array of Inexpensive/Independent Disks offers block-level access to multiple hard drives attached to a single machine.
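Parity-based RAID levels (e.g. RAID 5) survive a single-disk failure by storing the XOR of the data blocks on an extra disk. A minimal sketch of the reconstruction idea (toy block sizes, not a real controller):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three data blocks striped across three disks, parity on a fourth (RAID 5 style).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# If disk 1 fails, its block is the XOR of the surviving blocks and the parity.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

The same XOR trick reconstructs any single lost block, which is why one parity disk suffices for one failure.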

Thesis on RAID

SAN Storage Area Network A dedicated high-speed network that connects storage nodes via ATA/SCSI commands, typically over Fibre Channel or Ethernet.

NAS Network Attached Storage A small server optimized for serving files via SMB/NFS/FTP.

Distributed File Systems


 * GPFS (IBM)
 * Lustre
 * PVFS

Successors
 * Hadoop FS
 * KFS

API Levels
Block Level The API for communication between the disk controller and the CPU via ATA or SCSI offers access to the individual blocks of the hard drive. There is no file system abstraction; block devices have to be formatted before use. API Description?
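At block level, storage is addressed by block index rather than by path. A sketch of that access pattern, using an ordinary file as a stand-in for a raw device (the `disk.img` path is an assumption for illustration):

```python
import os

BLOCK_SIZE = 512  # a common logical block size

def read_block(fd, block_number):
    """Read one block by index -- no file-system abstraction, only offsets."""
    os.lseek(fd, block_number * BLOCK_SIZE, os.SEEK_SET)
    return os.read(fd, BLOCK_SIZE)

def write_block(fd, block_number, data):
    """Write one full block at its index."""
    assert len(data) == BLOCK_SIZE
    os.lseek(fd, block_number * BLOCK_SIZE, os.SEEK_SET)
    os.write(fd, data)

# A regular file stands in for the raw device (e.g. /dev/sda) here.
fd = os.open("disk.img", os.O_RDWR | os.O_CREAT, 0o600)
write_block(fd, 3, b"\xab" * BLOCK_SIZE)
assert read_block(fd, 3) == b"\xab" * BLOCK_SIZE
os.close(fd)
```

A file system is exactly the layer that maps paths and byte ranges onto such block indices.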

The File System API offered by the OS / device driver.
 * GLIBC FS-API Reference
 * fcntl.h
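The flags declared in fcntl.h (O_CREAT, O_APPEND, ...) are mirrored by Python's os module, which makes for a compact illustration. O_APPEND forces every write to the current end of file, a local precursor of the fast append path GFS offers:

```python
import os

# O_APPEND: every write lands at the current end of file,
# regardless of where the file offset was seeked to.
fd = os.open("app.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"first\n")
os.lseek(fd, 0, os.SEEK_SET)   # seek position is ignored for O_APPEND writes
os.write(fd, b"second\n")
os.close(fd)

with open("app.log", "rb") as f:
    assert f.read() == b"first\nsecond\n"
```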

File Level API offered by protocols like FTP, SMB, or NFS.
 * Network File System Protocol to access data on remote drives.
 * FTP Protocol

Further reading: Techrepublic.com

Specifications
Runs on a cluster of commodity servers and serves clients on the same network.

Offers a simplified file-level API to clients:
 * Open / Close file specified by path
 * Read file at arbitrary locations (fast)
 * Write file at arbitrary locations (may be slow)
 * Append to file (fast)
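The interface above can be sketched as a tiny in-memory client, just to make the operations concrete (hypothetical class and method names, not the real GFS client library):

```python
class ToyDFSClient:
    """In-memory sketch of the simplified file-level API:
    open/create by path, read at arbitrary offsets, write, append."""

    def __init__(self):
        self._files = {}  # path -> bytearray

    def create(self, path):
        self._files.setdefault(path, bytearray())

    def read(self, path, offset, length):
        # Random reads are cheap: slice at an arbitrary offset.
        return bytes(self._files[path][offset:offset + length])

    def write(self, path, offset, data):
        # Random writes are supported, but may be slow in a real system.
        self._files[path][offset:offset + len(data)] = data

    def append(self, path, data):
        # The fast path; returns the offset the data landed at.
        buf = self._files[path]
        offset = len(buf)
        buf.extend(data)
        return offset

client = ToyDFSClient()
client.create("/logs/frontend.log")
off = client.append("/logs/frontend.log", b"GET /index\n")
assert client.read("/logs/frontend.log", off, 4) == b"GET "
```

Note the deliberate asymmetry: appends return their offset and are the optimized path, mirroring the workload GFS was designed for.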

Requirements:
 * Scalability: Adjust automatically when new nodes join the cluster
 * Fault Tolerance: Deal with hardware failures
 * Concurrency: Allow multiple clients to access the same files at the same time.
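Fault tolerance in GFS comes from replicating each chunk on several chunkservers. A hedged sketch of a placement policy (this uses rendezvous hashing as a stand-in; the real master also weighs disk usage and rack locality):

```python
import hashlib

REPLICATION = 3  # GFS's default replica count

def pick_replicas(chunk_id, servers, n=REPLICATION):
    """Deterministically rank servers for a chunk and keep the first n."""
    ranked = sorted(
        servers,
        key=lambda s: hashlib.md5(f"{chunk_id}:{s}".encode()).hexdigest(),
    )
    return ranked[:n]

servers = [f"cs{i:02d}" for i in range(10)]
placement = pick_replicas("chunk-0042", servers)
assert len(placement) == 3

# If a server dies, re-run placement over the survivors; the chunk is
# re-replicated from one of the remaining copies.
survivors = [s for s in servers if s != placement[0]]
assert len(pick_replicas("chunk-0042", survivors)) == 3
```

Deterministic ranking means any node can recompute where a chunk's replicas should live, which simplifies recovery after failures.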

Optimizations:
 * Locality: Perform computations locally if possible
 * Concurrent appends should be fast
 * High throughput (at the cost of high latency)
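The "concurrent appends should be fast" point is realized in GFS as record append: the storage system, not the client, picks the offset, so many writers can append to one file without coordinating. A toy model of that contract (a single lock here; the real chunkserver protocol is distributed):

```python
import threading

class AppendLog:
    """Sketch of record append: the server picks the offset,
    so concurrent appenders never overwrite each other."""

    def __init__(self):
        self._buf = bytearray()
        self._lock = threading.Lock()

    def record_append(self, data):
        with self._lock:
            offset = len(self._buf)
            self._buf.extend(data)
        return offset

log = AppendLog()
offsets = {}

def worker(i):
    offsets[i] = log.record_append(f"record-{i};".encode())

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each record is intact at the offset its append returned,
# regardless of how the threads interleaved.
for i, off in offsets.items():
    rec = f"record-{i};".encode()
    assert bytes(log._buf[off:off + len(rec)]) == rec
```

Returning the offset instead of accepting one is the key design choice: it trades byte-exact placement for lock-free clients and high aggregate append throughput.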

Baseline Ideas

 * Mount all devices on a single machine, e.g. via NFS, and expose them to clients via an FTP interface.
 * A big RAID array attached to one machine
 * A Storage Area Network