S3 Select Command Currently in Preview Mode

The S3 Select command was released in preview in 2017. It provides the ability to query an S3 object and retrieve only the necessary data. This improves the performance of retrieving data from S3 and also reduces the running cost of applications that query data from S3.

Many applications do not require the full contents of an S3 object, only part of it, and this is where the S3 Select command comes into play. By decreasing the data that needs to be retrieved from S3, applications have less data to process and can perform better.
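To make "partial data" concrete, here is a minimal sketch in JavaScript of a query that asks for two columns and filters rows on the server side. The helper `buildCsvSelectParams`, the column names `name` and `country`, and the bucket and key values are all hypothetical, and the sketch assumes a CSV object whose first row holds the column names.

```javascript
// Hypothetical helper (not part of the AWS SDK): builds the parameters for a
// query against a CSV object whose first row contains the column names.
function buildCsvSelectParams(bucket, key, columns, whereClause) {
    // Turn ['name', 'country'] into "s.name, s.country".
    const columnList = columns.map((name) => `s.${name}`).join(', ');
    let expression = `SELECT ${columnList} FROM S3Object s`;
    if (whereClause) {
        expression += ` WHERE ${whereClause}`;
    }
    return {
        Bucket: bucket,
        Key: key,
        ExpressionType: 'SQL',
        Expression: expression,
        InputSerialization: {
            CompressionType: 'NONE',
            // USE lets the expression refer to columns by their header names.
            CSV: { FileHeaderInfo: 'USE' }
        },
        OutputSerialization: { CSV: {} }
    };
}

// Placeholder bucket, key, and column names, for illustration only.
const selectParams = buildCsvSelectParams(
    'my-demo-bucket',
    'file.csv',
    ['name', 'country'],
    "s.country = 'US'"
);
console.log(selectParams.Expression);
// prints: SELECT s.name, s.country FROM S3Object s WHERE s.country = 'US'
```

Only the rows and columns matched by the expression would be sent back, rather than the whole object.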

During preview mode, the S3 Select command is available in Lambda (using Python and Java).

Below is sample source code (a sketch) using the JavaScript SDK that could be run in a Lambda function.


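        const AWS = require('aws-sdk');
        const s3 = new AWS.S3();

        const bucketName = 'my-demo-bucket';
        const filename = 'file.csv';

        // Query settings: return only the first column of every row.
        const queryCsv = {
            ExpressionType: 'SQL',
            Expression: 'SELECT s._1 FROM S3Object AS s',
            InputSerialization: {
                CompressionType: 'NONE',
                CSV: {
                    // IGNORE skips the header row; columns are addressed by position (_1, _2, ...).
                    FileHeaderInfo: 'IGNORE',
                    RecordDelimiter: '\n',
                    FieldDelimiter: ','
                }
            },
            OutputSerialization: {
                CSV: {
                    RecordDelimiter: '\n',
                    FieldDelimiter: ','
                }
            }
        };

        exports.handler = (event, context, callback) => {
            // The query settings are top-level parameters, alongside Bucket and Key.
            const params = Object.assign({ Bucket: bucketName, Key: filename }, queryCsv);

            s3.selectObjectContent(params, (err, data) => {
                if (err) {
                    console.log(err);
                    return callback(err);
                }

                // data.Payload is an event stream; the selected rows arrive in Records events.
                const chunks = [];
                data.Payload.on('data', (streamEvent) => {
                    if (streamEvent.Records) {
                        chunks.push(streamEvent.Records.Payload);
                    }
                });
                data.Payload.on('error', callback);
                data.Payload.on('end', () => {
                    const result = Buffer.concat(chunks).toString('utf8');
                    console.log(result);
                    callback(null, result);
                });
            });
        };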
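
The response payload is delivered as a stream of events rather than a single value, and a single row may be split across two Records events. The following is a minimal sketch of one way to gather the rows, assuming Node.js 10 or later, where readable streams can be consumed with `for await`. The helper `collectRows` and the stand-in `fakePayload` are hypothetical names used for illustration; in a real call the stream would be `data.Payload`.

```javascript
// Hypothetical helper: collects the rows from an S3 Select event stream
// (or any async iterable that yields events of the same shape).
// Chunks are joined before splitting because a row may span two Records events.
async function collectRows(payload, recordDelimiter = '\n') {
    const chunks = [];
    for await (const streamEvent of payload) {
        // Only Records events carry row data; other event types are skipped.
        if (streamEvent.Records) {
            chunks.push(streamEvent.Records.Payload);
        }
    }
    return Buffer.concat(chunks)
        .toString('utf8')
        .split(recordDelimiter)
        .filter((row) => row.length > 0);
}

// Stand-in for data.Payload so the sketch runs without AWS access.
// Note that the row "bob" is deliberately split across two events.
async function* fakePayload() {
    yield { Records: { Payload: Buffer.from('alice\nbo') } };
    yield { Progress: { Details: {} } };
    yield { Records: { Payload: Buffer.from('b\ncarol\n') } };
    yield { Stats: { Details: {} } };
    yield { End: {} };
}

collectRows(fakePayload()).then((rows) => console.log(rows));
// prints: [ 'alice', 'bob', 'carol' ]
```

Buffering everything before splitting keeps the sketch short. For large results, it would be better to process rows as they arrive.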