Archive for the ‘Big Data’ Category

Quick Tip: Get the Size of MS SQL Server Table in 1 Line

sp_spaceused ‘[dbo].[table]’

MS SQL Server: Get Size of All Tables in a DB

SELECT 
    s.Name AS SchemaName,
    t.NAME AS TableName,
    p.rows AS RowCounts,
    SUM(a.total_pages) * 8 AS TotalSpaceKB, 
    SUM(a.used_pages) * 8 AS UsedSpaceKB, 
    (SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM 
    sys.tables t
INNER JOIN 
    sys.schemas s ON s.schema_id = t.schema_id
INNER JOIN      
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    t.NAME NOT LIKE 'dt%' 
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255 
GROUP BY 
    t.Name, s.Name, p.RowsORDER BY 
    s.Name, t.Name

Quick Tip: Check Last Update Time of a SQL Table (Note: requires permission)

SELECT OBJECT_NAME (OBJECT_ID) AS DatabaseName, last_user_update,*
FROM sys .dm_db_index_usage_stats
WHERE database_id = DB_ID( ‘TomasTest’)
AND OBJECT_ID =OBJECT_ID( ‘dbo.forecast’)

HiveQL: SQL Connector

 
To bridge the difference between SQL and HiveQL, we have the SQL Connector
feature to translate standard SQL-92 queries into equivalent HiveQL queries. The SQL
Connector performs syntactical translations and structural transformations. For
example:
1. Quoted Identifiers
HiveQL uses back-quote while SQL uses double quote when quoting identifiers.
Even when a driver reports the back-quote as the quote character, some
applications still generate double quoted identifiers.
2. Table Aliases
HiveQL does not support the AS keyword between a table reference and its
alias.
3. JOIN, INNER JOIN and CROSS JOIN
SQL INNER JOIN and CROSS JOIN syntax is translated to HiveQL JOIN syntax.
4. TOP N/LIMIT
SQL TOP N queries are transformed to HiveQL LIMIT queries.

MySQL v. MS SQL Server Information Schema Queries

In MySQL, you would run the following to get information on tables in a database:
SELECT * FROM information_schema.tables
WHERE table_schema = ‘YourDatabaseName’;
In MS SQL Server, the same would be:

USE YourDatabaseName
SELECT * FROM INFORMATION_SCHEMA.tables;
– See more at: http://dbadiaries.com/querying-information_schema#sthash.aQp6qbaP.dpuf

Quick Tip: Scaling in Hbase

Everything scales — hence autosharding, notifications to ZooKeeper, etc

So you don’t have to do anything to make it distributed/scalable

Hbase: Disable and Drop

Disable and drop your table to clean up:

hbase(main):012:0> disable 'test'
0 row(s) in 1.0930 seconds
hbase(main):013:0> drop 'test'
0 row(s) in 0.0770 seconds 

Exit the shell by typing exit.

hbase(main):014:0> exit

Quick Tip: HBase Get (a simple example)

hbase(main):009:0> get  ‘test’, ‘row1’

COLUMN                CELL
 cf:a                 timestamp=1381963930588, value=value1

1 row(s) in 0.0190 seconds

Quick Tip: HBase Scan (a simple example)

hbase(main):008:0> scan ‘test’
ROW                   COLUMN+CELL
 row1                 column=cf:a, timestamp=1381963930588, value=value1
 row2                 column=cf:b, timestamp=1381963944569, value=value2
 row3                 column=cf:c, timestamp=1381963957538, value=value3

3 row(s) in 0.1270 seconds

Quick Tip: How to Connect to a Running Session (shell) in HBase in 1 Line

./bin/hbase shell