Getting started

Foreword

The library is written in TypeScript, so the best way to learn how it works is to read the code in lib/, the tests in tests/e2e, and the examples.

If you find a mistake, or something misleading or confusing, feel free to create an issue or send a pull request.

Example

const { DBSQLClient } = require('@databricks/sql');

const client = new DBSQLClient();
const utils = DBSQLClient.utils;

client.connect({ 
    host: '...', 
    path: '/sql/1.0/endpoints/****************', 
    token: 'dapi********************************', 
}).then(async client => {
    const session = await client.openSession();
    
    const createTableOperation = await session.executeStatement(
        'CREATE TABLE IF NOT EXISTS pokes (foo INT, bar STRING)'
    );
    await utils.waitUntilReady(createTableOperation, false, () => {});
    await createTableOperation.close();
    
    const loadDataOperation = await session.executeStatement(
        'INSERT INTO pokes VALUES (123, "Hello, world!")'
    );
    await utils.waitUntilReady(loadDataOperation, false, () => {});
    await loadDataOperation.close();
    
    const selectDataOperation = await session.executeStatement(
        'SELECT * FROM pokes', { runAsync: true }
    );
    await utils.waitUntilReady(selectDataOperation, false, () => {});
    await utils.fetchAll(selectDataOperation);
    await selectDataOperation.close();
    
    const result = utils.getResult(selectDataOperation).getValue();
    
    console.log(JSON.stringify(result, null, '\t'));
    
    await session.close();
    await client.close();
})
.catch(error => {
    console.error(error);
});

Error handling

Some network-related errors are thrown asynchronously. The driver does not handle these cases, so you should handle them yourself. The simplest way is to subscribe to the "error" event:

client.on('error', (error) => {
    // ...
});
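
The sketch below shows why attaching the listener early matters. It uses Node's built-in EventEmitter as a stand-in for the client; the FakeClient class and its simulated failure are assumptions made for illustration and are not part of the driver. In Node, an "error" event emitted with no listener attached is thrown, so a listener registered before connecting keeps such failures from crashing the process.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the client; it only mimics emitting 'error'.
class FakeClient extends EventEmitter {
    simulateNetworkFailure() {
        this.emit('error', new Error('connection reset'));
    }
}

const received = [];
const fakeClient = new FakeClient();

// Register the handler before doing any network work.
fakeClient.on('error', (error) => {
    received.push(error.message);
});

fakeClient.simulateNetworkFailure();
console.log(received);
```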

HiveSession

After you connect to the server, open a session to start working with the Hive server.

...
const session = await client.openSession();

To open a session you must provide an OpenSessionRequest. The only required parameter is "client_protocol", which synchronizes the version of the HiveServer2 API.

In "configuration" you may set any configuration properties that your Hive instance requires for the session.

After the session is opened, you will have a HiveSession instance.

The HiveSession class is a facade for the API that works with SessionHandle.

The method you will use most often is executeStatement:

...
const operation = await session.executeStatement(
    'CREATE TABLE IF NOT EXISTS pokes (foo INT, bar STRING)',
    { runAsync: true }
);

"statement" is a DDL/DML statement (CREATE TABLE, INSERT, UPDATE, SELECT, LOAD, etc.).
options:
- runAsync executes the operation asynchronously.
- confOverlay overrides session configuration properties.
- timeout is the maximum time allowed for executing an operation. It has the Buffer type because the value is a 64-bit integer in Hive, so you should use the node-int64 npm module for it.
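
To illustrate why a 64-bit value needs a Buffer-backed representation, here is a small sketch that uses only Node built-ins (BigInt and Buffer). It is not how the driver or node-int64 is implemented; it only shows that a plain JavaScript number loses precision above 2^53 - 1, while eight bytes in a Buffer keep the value intact.

```javascript
// A value that needs more than 53 bits of precision.
const big = 2n ** 62n + 1n;

// Converting to a regular number silently rounds the value.
const lossy = Number(big);

// Eight bytes hold the full 64-bit integer without loss.
const buf = Buffer.alloc(8);
buf.writeBigInt64BE(big);
const roundTripped = buf.readBigInt64BE();

console.log(BigInt(lossy) === big);  // precision was lost
console.log(roundTripped === big);   // the buffer kept every bit
```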

For other methods, see IHiveSession and examples/session.js.

HiveOperation

In most cases, HiveSession methods return HiveOperation, which helps you to retrieve requested data.

After you fetch the result, the operation holds the TableSchema and the data (Array<RowSet>).

HiveUtils

An operation is executed asynchronously, so before retrieving the result you have to wait until it reaches the finished state.

...
const response = await operation.status();
const isReady = response.operationState === TCLIService_types.TOperationState.FINISHED_STATE;
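
A single status check is rarely enough, so the check is usually repeated in a loop. The sketch below shows one way to poll. It runs against a stub operation, and the State names, makePollingStub and pollUntilFinished are hypothetical helpers for illustration; the real state constants come from TCLIService_types.TOperationState.

```javascript
// Hypothetical state names standing in for TOperationState values.
const State = { RUNNING: 'RUNNING', FINISHED: 'FINISHED', ERROR: 'ERROR' };

// Stub operation that reports RUNNING twice and then FINISHED.
function makePollingStub() {
    let calls = 0;
    return {
        async status() {
            calls += 1;
            return { operationState: calls < 3 ? State.RUNNING : State.FINISHED };
        },
    };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls status() until the operation finishes; returns the number of polls.
async function pollUntilFinished(operation, intervalMs = 10) {
    let polls = 0;
    for (;;) {
        const response = await operation.status();
        polls += 1;
        if (response.operationState === State.FINISHED) return polls;
        if (response.operationState === State.ERROR) throw new Error('operation failed');
        await sleep(intervalMs);
    }
}

pollUntilFinished(makePollingStub()).then((polls) => console.log('polls:', polls));
```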

Also, the result is fetched in portions. You can set the size of a portion with the setMaxRows() method.

...
operation.setMaxRows(500);
const status = await operation.fetch();
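
Fetching everything means repeating fetch() until no rows remain. The sketch below shows that loop against a stub operation. The stub, fetchEverything, and the hasMoreRows() call shape are assumptions for illustration; check IOperation for the actual interface.

```javascript
// Stub operation that serves the given rows in chunks of at most maxRows.
function makePagedStub(rows) {
    let maxRows = 100;
    let offset = 0;
    const data = [];
    return {
        setMaxRows(n) { maxRows = n; },
        async fetch() {
            data.push(rows.slice(offset, offset + maxRows));
            offset += maxRows;
        },
        hasMoreRows() { return offset < rows.length; },
        getData() { return data; },
    };
}

// Fetches one portion at a time until the operation reports no more rows.
async function fetchEverything(operation) {
    do {
        await operation.fetch();
    } while (operation.hasMoreRows());
    return operation.getData();
}

const pagedOperation = makePagedStub([0, 1, 2, 3, 4]);
pagedOperation.setMaxRows(2);
fetchEverything(pagedOperation).then((chunks) => console.log(chunks));
```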

After you have fetched all the data and have both the schema and the data set, you can transform the data into a readable format.

...
const schema = operation.getSchema();
const data = operation.getData();

To simplify this process, you may use HiveUtils.

/**
 * Waits until the operation reaches the finished status or one of the invalid states.
 * 
 * @param operation operation to perform
 * @param progress flag for the operation status command. If set to true, the response will include progressUpdateResponse with progress information
 * @param callback if specified, it is called each time an operation status response is received, with the response passed as the first parameter
 */
waitUntilReady(
    operation: IOperation,
    progress?: boolean,
    callback?: Function
): Promise<IOperation>

/**
 * Fetches data while the operation has more rows (hasMoreRows).
 * 
 * @param operation
 */
fetchAll(operation: IOperation): Promise<IOperation>

/**
 * Transforms operation result
 * 
 * @param operation operation to perform
 * @param resultHandler you may specify your own handler. If not specified, the result is transformed to JSON
 */
getResult(
    operation: IOperation,
    resultHandler?: IOperationResult
): IOperationResult

NOTICE

- node-int64 is used for 64-bit types.
- To see how data is represented in JSON, look at JsonResult.test.js.

For more details see IOperation.

Example

const { DBSQLClient } = require('@databricks/sql');
const utils = DBSQLClient.utils;
...
await utils.waitUntilReady(
    operation,
    true,
    (stateResponse) => {
        console.log(stateResponse.taskStatus);
    }
);
await utils.fetchAll(operation);

const result = utils.getResult(operation).getValue();

Status

You may notice that most operations return a Status, which helps you determine the state of an operation. The status also contains the error, if there is one.

Finalize

After you finish working with an operation, session, or client, it is best to close it. Each of them has a close() method.
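
One way to make sure close() always runs, even when the work in between throws, is a try/finally wrapper. The sketch below uses stub resources; withClosing and makeResource are hypothetical helpers for illustration and are not part of the driver.

```javascript
const closed = [];

// Stub resource that records its name when closed.
const makeResource = (name) => ({
    name,
    async close() { closed.push(name); },
});

// Runs `work` and closes the resource afterwards, even if `work` throws.
async function withClosing(resource, work) {
    try {
        return await work(resource);
    } finally {
        await resource.close();
    }
}

// Nested usage closes in reverse order of opening.
async function demo() {
    await withClosing(makeResource('client'), async () => {
        await withClosing(makeResource('session'), async () => {
            await withClosing(makeResource('operation'), async () => {
                // work with the operation here
            });
        });
    });
    return closed;
}
```

Nesting the wrappers this way closes the operation first, then the session, then the client.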
