Automate Parquet Integration Tasks from PowerShell
Looking for a quick and easy way to access Parquet data from PowerShell? This article shows how to use the Parquet Cmdlets for tasks like connecting to Parquet data, automating operations, downloading data, and more.
The CData Cmdlets for Parquet are standard PowerShell cmdlets that make it easy to accomplish data cleansing, normalization, backup, and other integration tasks by enabling real-time access to Parquet.
PowerShell Cmdlets or ADO.NET Provider?
The Cmdlets provide both a PowerShell interface and an SQL interface to Parquet; this tutorial uses both to retrieve Parquet data. It also shows the ADO.NET equivalent, which relies on the CData ADO.NET Provider for Parquet. Use that provider to access Parquet data from other .NET applications, such as LINQPad.
Once you have the necessary connection properties, you can access Parquet data in PowerShell in three steps.
Connect to your local Parquet file(s) by setting the URI connection property to the location of the Parquet file.
PowerShell
- Install the module:
    Install-Module ParquetCmdlets
- Connect:
    $parquet = Connect-Parquet -URI "$URI"
- Search for and retrieve data:
    $column2 = "SAMPLE_VALUE"
    $sampletable_1 = Select-Parquet -Connection $parquet -Table "SampleTable_1" -Where "Column2 = '$column2'"
    $sampletable_1
You can also use the Invoke-Parquet cmdlet to execute SQL commands:
$sampletable_1 = Invoke-Parquet -Connection $parquet -Query 'SELECT * FROM SampleTable_1 WHERE Column2 = @Column2' -Params @{'@Column2'='SAMPLE_VALUE'}
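The introduction mentions downloading and backing up data. One way to do that, shown here only as a minimal sketch, is to pipe the retrieved rows to PowerShell's built-in Export-Csv cmdlet. The rows below are stand-in objects so the sketch runs without the module; in practice $sampletable_1 would hold the result of Select-Parquet or Invoke-Parquet. The stand-in values and the output path are assumptions for illustration.

```powershell
# Stand-in rows (hypothetical values); in practice, use the result of
# Select-Parquet or Invoke-Parquet from the steps above.
$sampletable_1 = @(
    [pscustomobject]@{ Id = 1; Column1 = 'alpha'; Column2 = 'SAMPLE_VALUE' }
    [pscustomobject]@{ Id = 2; Column1 = 'beta';  Column2 = 'SAMPLE_VALUE' }
)

# Hypothetical output location; adjust to your environment.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'SampleTable_1.csv'

# Keep only the data columns and write them to a CSV file.
$sampletable_1 |
    Select-Object -Property Id, Column1, Column2 |
    Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back to confirm what was written.
$exported = Import-Csv -Path $csvPath
```

Select-Object is included because cmdlet results may carry properties beyond the table's columns; this is an assumption about the module's output, so inspect a real result with Get-Member before choosing which properties to keep.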
ADO.NET
- Load the provider's assembly:
    [Reflection.Assembly]::LoadFile("C:\Program Files\CData\CData ADO.NET Provider for Parquet\lib\System.Data.CData.Parquet.dll")
- Connect to Parquet:
    $conn = New-Object System.Data.CData.Parquet.ParquetConnection("URI=C:/folder/table.parquet;")
    $conn.Open()
- Instantiate the ParquetDataAdapter, execute an SQL query, and output the results:
    $sql = "SELECT Id, Column1 FROM SampleTable_1"
    $da = New-Object System.Data.CData.Parquet.ParquetDataAdapter($sql, $conn)
    $dt = New-Object System.Data.DataTable
    $da.Fill($dt)
    $dt.Rows | foreach { Write-Host $_.id $_.column1 }
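Write-Host prints to the console only, so the rows it displays cannot be piped to other cmdlets. If further processing is needed, one option is to convert the filled DataTable into plain objects, sketched below. The table here is built in memory with hypothetical values so the sketch runs without the provider; with the provider installed, $dt would be the table filled by the data adapter in the step above.

```powershell
# Stand-in DataTable (hypothetical values); in practice this is the table
# filled by the ParquetDataAdapter in the step above.
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add('Id', [int])
[void]$dt.Columns.Add('Column1', [string])
[void]$dt.Rows.Add(1, 'alpha')
[void]$dt.Rows.Add(2, 'beta')

# Convert each DataRow into a plain object that holds only the two columns.
$rows = foreach ($row in $dt.Rows) {
    [pscustomobject]@{
        Id      = $row['Id']
        Column1 = $row['Column1']
    }
}

# The objects can now flow through the pipeline like any other output.
$rows | Sort-Object -Property Id -Descending | Format-Table
```

When the work is finished, closing the connection with $conn.Close() releases it, assuming ParquetConnection follows the standard ADO.NET connection interface.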