Prévia do material em texto
Writing SQL Queries against Excel files (Excel SQL) Follow Note: QuerySurge 6.3 and below connect to Excel using Java's built-in JDBC/ODBC bridge along with the Microsoft's Excel ODBC driver. This article describes the query syntax for this JDBC/ODBC bridge-based approach. Since QuerySurge 6.3, QuerySurge has shipped with its own proprietary JDBC driver for Excel. Users are strongly urged to use this all-Java JDBC driver for Excel data, as Java's JDBC/ODBC bridge has been removed in Java 8 and above. Details for the new QuerySurge JDBC Driver for Excel are available in this article. Note: Connection to Excel via Java's builtin ODBC/JDBC bridge feature (on which this connection depends) is being deprecated starting with the QuerySurge 6.4 release, since Java has removed the bridge feature. While QuerySurge 6.4 is backwards- compatible for your existing Excel queries, you should start planning to move your existing queries to the QuerySurge Excel JDBC Driver. The current date for sunset of backwards-compatibility is October 31, 2019. For more information, see our FAQ. Microsoft Excel handles SQL via its own SQL dialect. The tutorial below provides an introduction to the syntax for SQL querying against Excel files, focusing on common approaches when pulling data from Excel as a QuerySurge Source or Target. Excel SQL Examples 1. Simple SELECT *. Create a query that selects all rows and columns from the Excel file. SELECT * FROM [SALES$]; In this example, the query fetches all rows and columns in the SALES sheet. Note the syntax for the table name in the FROM clause is: [SHEETNAME$] (using enclosing square brackets and a dollar sign after the sheet name). You can query against different sheets in an Excel file using this syntax. 2. Simple SELECT. Create a query that selects specific columns from the Excel file. SELECT [QUANTITY],[NAME],[PRICE] FROM [SALES$]; In this example, we specify the columns we would like the query to return. Note the syntax for the column names in the SELECT clause is: [COLUMN1], [COLUMN2]. https://querysurge.zendesk.com/hc/en-us/articles/205766136-Writing-SQL-Queries-against-Excel-files-Excel-SQL-/subscription.html https://querysurge.zendesk.com/knowledge/articles/360007800591/ https://querysurge.zendesk.com/hc/en-us/articles/360020869591 3. WHERE clause. Use a WHERE clause in your query to filter your Excel data. SELECT [QUANTITY],[NAME],[PRICE] FROM [SALES$] WHERE [SALE_ID] >= 23 AND [SALE_ID] <= 28; In this example, we limit our result set to records whose [SALE_ID] is >= 23 and < 28. The syntax for column names in the WHERE clause uses square brackets, as we saw previously: [COLUMN1]. 4. Cell Ranges. Limit your query to a specific cell range. SELECT [QUANTITY],[NAME],[PRICE] FROM [SALES$A1:E101]; In this example, we do not impose any limitations on the values themselves. However, we direct the query to look only at a Range of cells (A1 through E101). Note that the cell range is specified after the dollar sign in the table name, using the colon between the first cell and the final cell in the range. 5. Filter by date syntax. Create a query that filters by date with an ORDER BY a. SELECT [SALE_ID],[SALE_DATE],[QUANTITY],[NAME],[PRICE] FROM [SALES$A1:E101] WHERE [SALE_DATE] = #12/6/2003# ORDER BY [SALE_ID]; b. SELECT [SALE_ID],[SALE_DATE],[QUANTITY],[NAME],[PRICE] FROM [SALES$A1:E101] WHERE [SALE_DATE] BETWEEN #12/6/2003# AND #1/1/2008# ORDER BY [SALE_ID]; In these examples, the query limits the result set to records where the [SALE_DATE] is equal to 12/6/2003 (example a) or between 12/6/2003 and 1/1/2008 (example b). Note the syntax for specifying date values is #M/d/yyyy#, using both leading and trailing hash symbols. Additionally, we order the records by their [SALE_ID] in the ORDER BY clause. The syntax for column names in the ORDER BY clause is (as before): [COLUMN1]. 6. String functions: MID. Create a query using a string function in the WHERE clause. SELECT [SALE_ID],[SALE_DATE],[QUANTITY],[NAME],[PRICE] FROM [SALES$A1:E101] WHERE [SALE_DATE] BETWEEN #12/6/2003# AND #1/1/2008# AND MID([NAME],1,4) = 'NYNY' ORDER BY [SALE_ID]; In this example, we have demonstrated the use of the VBA MID() function. A listing of String functions is at: https://msdn.microsoft.com/en- us/library/dd789093.aspx. 7. Data Type Conversions. Create a query showing how to do a type conversion. SELECT [ID],CLNG([OID]),[DESC],[SALE_DATE] FROM [PRODUCTS$]; The second column has a type conversion – to the LONG type. VBA type conversion functions include the following: Type Conversion Function Return Type (Java.sql.Types) CBool(col_name) SMALLINT CByte(col_name) SMALLINT CDate(col_name) TIMESTAMP CDbl(col_name) DOUBLE CInt(col_name) SMALLINT CLng(col_name) INTEGER CSng(col_name) DOUBLE CStr(col_name) VARCHAR 8. Note: CDate() does not recognize newer date formats, such as the yyyy-MM- ddTHH:mm:ss format. However, it does recognize the yyyy-MM-dd HH:mm:ss format. You can use the REPLACE function to change the ‘T’ to a space, to get a format that is recognized: CDate(REPLACE([date_col], 'T', '') 9. Date formatting. Create a query formatting a date column. SELECT FORMATDATETIME( [SALE_DATE], 1), FORMAT([SALE_DATE], 'yyyy-mm-dd') FROM [SALES$] WHERE [SALE_DATE] BETWEEN #12/6/2003# AND #1/1/2008#; The FORMATDATETIME function has a set of fixed options for formatting, shown in https://msdn.microsoft.com/en-us/library/dd789093.aspx https://msdn.microsoft.com/en-us/library/dd789093.aspx the following table. Format Code Description 0 Display a date and/or time. Date parts are displayed in short date format. Time parts are displayed in long time format. 1 Display a date using the long date format specified in your computer's regional settings. 2 Display a date using the short date format specified in your computer's regional settings. 3 Display a time using the time format specified in your computer's regional settings. 4 Display a time using the 24-hour format (hh:mm). 10. If you need a more flexible set of formatting options, the FORMAT function takes a format template string: Format Code Description d Display the day as a number without a leading zero (1 – 31). dd Display the day as a number with a leading zero (01 – 31). ddd Display the day as an abbreviation (Sun – Sat). m Display the month as a number without a leading zero (1 – 12). If m immediately follows h or hh, the minute rather than the month is displayed. mm Display the month as a number with a leading zero (01 – 12). If m immediately follows h or hh, the minute rather than the month is displayed. mmm Display the month as an abbreviation (Jan – Dec). mmmm Display the month as a full month name (January – December). oooo The same as mmmm, only it's the localized version of the string. y Display the day of the year as a number (1 – 366). yy Display the year as a 2-digit number (00 – 99). yyyy Display the year as a 4-digit number (100 – 9999). 11. See https://msdn.microsoft.com/en-us/library/office/gg251755.aspx for a listing of format options. 12. Number formatting. Create a query showing how to format a number. SELECT [QUANTITY],[NAME],format([PRICE],'###.000') FROM [SALES$] WHERE [SALE_DATE] BETWEEN #12/6/2003# AND #1/1/2008#; https://msdn.microsoft.com/en-us/library/office/gg251755.aspx The [PRICE] column is formatted to three decimal places. Note that the formatting code must be enclosed in single quotes, not double quotes. Note also that the format function returns a string type, so if you want to do math with a formatted value, you’ll have to cast it to a numerical type with a type conversion function. The table below shows a variety of formattingexamples. Format Explanation Example Output 0 Always display a digit including leading/trailing 0; Rounds if necessary Format(31.4159, '000.00000') 031.41590 0 - Format(31.4159, '00.000') 31.416 # Display digit or for leading/trailing 0, omit the 0 digit; Rounds if necessary Format(31.4159, '###.#####') 31.4159 # - Format(31.4159, '##.####0') 31.41590 % Fraction expressed as percent Format(0.4159, '##%') 42% E- E+ Scientific notation Format(3.14159, '####E-##') 3142E-3 - + $ () Include a literal character Format(31.4159, '$####.00') $31.42 \ Escape a special character Format(0.4159, '0.###\%') 0.416% 13. 14. Conditionals. Create a query showing how to conditionally modify results. In many SQL dialects, this is done with a CASE statement. In Excel SQL, this is done with the IIf() function. The function signature is: IIf (expr, truepart, falsepart). If expr evaluates to true, then truepart is returned, otherwise, falsepart is returned. The following example returns the string 'SPECIAL!' if the price is less than $1.00; otherwise it returns the actual price. SELECT [QUANTITY],[NAME], IIf([PRICE] < 1.00,'SPECIAL!',[PRICE]) FROM [SALES$] 15. Null Handling. Under some conditions, such as when a cell is has no data, Excel returns a Null value. You can test whether a cell is null using the IsNull(expr) IsNull() returns a Boolean true value (=-1) if the argument is null, and a Boolean false (=0) if the argument is not null. A simple example is: SELECT IsNull([QUANTITY]) FROM [SALES$] This function can be combined with the IIf() syntax to return a specific value in cases where a null is found in a column, and the actual column value where the value is not null. Note, in the following example, that the value that we return when we find a null is the second argument – the truepart argument, because this is returned when IsNull() returns true . SELECT IIf(IsNull([PRICE]), 'NULL FOUND!', [PRICE]) FROM [SALES$] For the following data, the query will return the price in the column for ID = 1, 2, 3; for ID = 4, the query returns ‘Null Found!’. ID PRICE 1 10.02 2 102.35 3 9424.23 4 Null Found!