System and Method for Leveraging Key-value Storage to Efficiently Store Data and Metadata in a Distributed File System
| Content Provider | The Lens |
|---|---|
| Abstract | A solid-state drive (SSD) includes: a plurality of data blocks; a plurality of flash channels and a plurality of ways to access the plurality of data blocks; and an SSD controller that configures a block size of the plurality of data blocks. A data file is stored in the SSD with one or more key-values pairs, and each key-value pair has a block identifier as a key and a block data as a value. A size of the data file is equal to the block size or a multiple of the block size. |
| Related Links | https://www.lens.org/lens/patent/009-434-845-606-967/frontpage |
| Language | English |
| Publisher Date | 2019-08-08 |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Patent |
| Jurisdiction | United States of America |
| Date Applied | 2018-03-23 |
| Applicant | Samsung Electronics Co Ltd |
| Application No. | 201815934747 |
| Claim | 1. A solid-state drive (SSD) comprising: a plurality of data blocks; a plurality of flash channels and a plurality of ways to access the plurality of data blocks; and an SSD controller that configures a block size of the plurality of data blocks, wherein a data file is stored in the SSD with one or more key-values pairs, and at least one key-value pair has a block identifier as a key and a block data as a value, and wherein a size of the data file is equal to the block size or a multiple of the block size. 2. The SSD of claim 1, wherein the SSD is used in a distributed file system including Hadoop Distributed File System (HDFS). 3. The SSD of claim 1, the SSD controller further configures to enable or disable block updates based on a block update flag. 4. The SSD of claim 1, the SSD controller further configures to align the data file with the plurality of data blocks based on an alignment flag. 5. The SSD of claim 1, wherein the block size is determined based on an erase unit of the SSD multiplied by a number of flash channels. 6. The SSD of claim 1, wherein the block size is determined based on an erase unit of the SSD multiplied by a number of ways. 7. The SSD of claim 1, wherein the block size is equal to an erase unit of the SSD. 8. The SSD of claim 1, wherein the SSD stores a file mapping table including a first mapping of the file to one or more data blocks of the plurality of data blocks associated with the file, and a second mapping of at least one of the one or more data blocks to a data node including the SSD. 9. A distributed data storage system comprising: a client; a name node comprising a first key-value (KV) solid-state drive (SSD); and a data node comprising a second KV SSD, wherein the second KV SSD comprises a plurality of data blocks, a plurality of flash channels and a plurality of ways to access the plurality of data blocks, and an SSD controller that configures a block size of the plurality of data blocks, wherein the client sends a create file request including a file identifier to store a data file to the name node and send an allocate command to the name node to allocate one or more data blocks of the plurality of data blocks associated with the data file, wherein the name node returns a block identifier of the one or more data blocks and a data node identifier of the data node that is assigned to store the one or more data blocks to the client, wherein the client sends a block store command to the data node to store the one or more data blocks, wherein the second KV SSD stores the one or more data blocks as key-values pairs, and at least one key-value pair has the block identifier as a key and a block data as a value, and wherein a size of the data file is equal to the block size or a multiple of the block size. 10. The distributed data storage system of claim 9, wherein the distributed data storage system employs Hadoop Distributed File System (HDFS). 11. The distributed data storage system of claim 9, wherein the second KV SSD stores a file mapping table including a first mapping of the data file to one or more data blocks associated with the file, and a second mapping of at least one of the one or more data blocks to a data node. 12. A method comprising: sending a create file request from a client to a name node, wherein the create file request includes a file identifier to store a data file; storing the file identifier as a key-value pair in a first key-value (KV) solid-state drive (SSD) of the name node, wherein the file identifier is stored in the key-value as a key, and a value associated with the key is empty; sending an allocate command from the client to the name node to allocate one or more data blocks associated with the data file; assigning, at the name node, a block identifier to at least one of the one or more data blocks and assigning a data node to store the one or more data blocks; returning the block identifier and a data node identifier of the data node from the name node to the client; sending a write block request from the client to the data node, wherein the write block request includes the block identifier and content; and saving the one or more data blocks in a second KV SSD of the data node as key-value pairs, wherein the second KV SSD of the data node comprises one or more data blocks having a block size, wherein at least one key-value pair has a block identifier as a key and a block data as a value, and wherein a size of the data file is equal to the block size or a multiple of the block size. 13. The method of claim 12, the client, the name node, and the data node are nodes in a Hadoop Distributed File System (HDFS). 14. The method of claim 12, further comprising setting a block update flag to enable or disable block updates. 15. The method of claim 12, further comprising setting an alignment flag to align the data file with the plurality of data blocks of the second KV SSD of the data node. 16. The method of claim 12, further comprising: sending a write commit command from the client to the name node including the file identifier and the block identifier; and appending a single direct operation to append the file identifier, the block identifier, and the data node in the name node. 17. The method of claim 16, further comprising: sending a read file request to read the data file from the client to the name node; returning the block identifier and the data node identifier for at least one of the one or more data blocks associated with the data file to the client; sending a block read command from the client to the data node to retrieve the one or more data blocks stored in the second KV SSD of the data node; and returning the block data identified by the block identifier from the data node to the client. 18. The method of claim 17, further comprising: sending a file delete command from the client to the name node including the file identifier; returning the block identifier and the data node identifier for at least one of the one or more data blocks associated with the data file to the client; sending a key-value delete command including the file identifier of the data file from the name node to the first KV SSD of the name node; sending a block delete command from the name node to the data node including a list of the one or more data blocks; and deleting the one or more data blocks stored in the second KV SSD of the data node. 19. The method of claim 12, wherein the second KV SSD stores a file mapping table including a first mapping of the file to one or more data blocks associated with the file, and a second mapping of at least one of the one or more data blocks to the data node. |
| CPC Classification | ELECTRIC DIGITAL DATA PROCESSING |
| Extended Family | 063-154-526-917-559 009-434-845-606-967 065-489-285-019-594 050-806-423-869-282 195-646-618-114-531 100-431-237-273-454 145-822-904-446-517 008-331-119-700-723 157-180-875-665-203 |
| Patent ID | 20190243906 |
| Inventor/Author | Bisson Timothy C; Shayesteh Anahita; Choi Changho |
| IPC | G06F3/06 |
| Status | Active |
| Owner | Samsung Electronics Co. Ltd |
| Simple Family | 009-434-845-606-967 065-489-285-019-594 063-154-526-917-559 050-806-423-869-282 195-646-618-114-531 100-431-237-273-454 145-822-904-446-517 008-331-119-700-723 157-180-875-665-203 |
| CPC (with Group) | G06F16/182 G06F3/0608 G06F3/064 G06F3/0643 G06F3/067 G06F3/0679 G06F3/0658 G06F16/13 G06F16/16 |
| Issuing Authority | United States Patent and Trademark Office (USPTO) |
| Kind | Patent Application Publication |
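
The create, allocate, write, commit, read, and delete steps recited in the method claims can be pictured with a small in-memory model. The following is a minimal sketch only, not the patented implementation and not an HDFS or KV SSD API: every class and function name (`KVSSD`, `NameNode`, `DataNode`, `block_size`, `write_file`, `read_file`, `delete_file`) is hypothetical, a Python dict stands in for each KV SSD, the empty value stored at file creation is modeled as an empty list, and round-robin data node assignment is an assumption made for illustration.

```python
def block_size(erase_unit, multiplier=1):
    """Block size as the erase unit times a channel or way count.
    multiplier=1 gives a block size equal to the erase unit."""
    return erase_unit * multiplier


class KVSSD:
    """Stand-in for a key-value SSD: an in-memory dict (illustrative only)."""

    def __init__(self):
        self.store = {}

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store[key]

    def delete(self, key):
        self.store.pop(key, None)


class DataNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.kv = KVSSD()  # plays the role of the "second KV SSD"


class NameNode:
    def __init__(self, data_node_ids):
        self.kv = KVSSD()  # plays the role of the "first KV SSD"
        self.data_node_ids = list(data_node_ids)
        self.next_block = 0

    def create_file(self, file_id):
        # File identifier is the key; the value starts out empty.
        self.kv.put(file_id, [])

    def allocate(self):
        # Assign a block identifier and a data node (round-robin is assumed).
        dn_id = self.data_node_ids[self.next_block % len(self.data_node_ids)]
        block_id = f"blk_{self.next_block}"
        self.next_block += 1
        return block_id, dn_id

    def commit(self, file_id, block_id, dn_id):
        # Append (block identifier, data node identifier) to the file entry.
        self.kv.put(file_id, self.kv.get(file_id) + [(block_id, dn_id)])


def write_file(nn, dns, file_id, data, bsize):
    # The file size must equal the block size or a multiple of it.
    if not data or len(data) % bsize:
        raise ValueError("file size must be a non-zero multiple of the block size")
    nn.create_file(file_id)
    for off in range(0, len(data), bsize):
        block_id, dn_id = nn.allocate()
        # Block identifier is the key, block data is the value.
        dns[dn_id].kv.put(block_id, data[off:off + bsize])
        nn.commit(file_id, block_id, dn_id)


def read_file(nn, dns, file_id):
    # Look up block locations on the name node, then fetch each block by key.
    return b"".join(dns[dn_id].kv.get(block_id)
                    for block_id, dn_id in nn.kv.get(file_id))


def delete_file(nn, dns, file_id):
    blocks = nn.kv.get(file_id)
    nn.kv.delete(file_id)               # key-value delete on the name node
    for block_id, dn_id in blocks:
        dns[dn_id].kv.delete(block_id)  # block delete on each data node
```

In this sketch each block is a single key-value pair, so a read is a direct lookup by block identifier and needs no separate on-disk file system layer on the data node; that is a design reading of the claims, not a statement about any shipping product.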