Originally posted by Hi-Angel
Think in terms of network filesystems, like Samba/CIFS, or NFS, or Apple's Netatalk...
Your computer mounts a filesystem that is served over the network by a Samba server.
That works on a small scale (a small office with a dozen workstations and a server).
But it scales badly to the size of whole giant clusters (several hundred nodes):
if several nodes start to fight for the same file, or if a huge number of nodes start hammering your poor Samba or NFS server, it won't handle the load nicely, and all the nodes will experience delays waiting to obtain exclusive access to some file.
That's where *cluster* filesystems enter:
they are designed for highly parallel access from a massive number of nodes, with high throughput and quick response.
Usually, for some extra performance, they use some specialised form of networking, like InfiniBand, or 10 Gbps Ethernet *with DMA*, etc.
They are able to have several servers coordinating and spreading the load.
Coordination is (supposed to be) fast, so nodes don't stall on locks or slow down, and can compute what they need without waiting too long for data.
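To make the "spreading the load" part a bit more concrete, here is a rough Python sketch of one common idea: round-robin striping, where a file is cut into fixed-size stripes and consecutive stripes live on different servers. This is only an illustration, not how OrangeFS or any of the filesystems named here actually lay out data; the server names, the stripe size and both function names are made up for the example.

```python
from collections import Counter

STRIPE_SIZE = 64 * 1024                 # hypothetical stripe size, in bytes
SERVERS = ["io0", "io1", "io2", "io3"]  # hypothetical I/O server names


def server_for_offset(offset, stripe_size=STRIPE_SIZE, servers=SERVERS):
    """Return the server holding the stripe that contains byte `offset`."""
    stripe_index = offset // stripe_size
    return servers[stripe_index % len(servers)]


def servers_for_read(offset, length, stripe_size=STRIPE_SIZE, servers=SERVERS):
    """Count how many stripes each server contributes to one read."""
    if length <= 0:
        return Counter()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return Counter(servers[i % len(servers)] for i in range(first, last + 1))


# A 1 MiB read covers 16 stripes, so each of the 4 servers serves 4 of them.
load = servers_for_read(0, 1024 * 1024)
print(dict(load))  # {'io0': 4, 'io1': 4, 'io2': 4, 'io3': 4}
```

With a single Samba or NFS server, all 16 of those chunks would come out of the same box. Here a big read is split four ways, and different nodes reading different parts of the file tend to land on different servers.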
Competitors to OrangeFS would be things like Lustre, Ceph, GlusterFS, GFS, IBM's GPFS, Google File System, etc.
Technologies vary (all nodes access the same disk servers over a SAN with something like Fibre Channel; files are accessed over the network, not unlike NFS; all nodes are peer-to-peer; etc.), but they all target the same kind of use case:
hundreds of nodes accessing petabytes of data, where performance means a lot.