App Insights continuous export to archive and ad-hoc queries

In this article I want to share how we reduced our App Insights costs for archived data to about 10% of the original, while still keeping the ability to query the data ad hoc.

Continuous export

Because of various legal requirements we are obligated to keep our App Insights history for a much longer period than the 90 days offered by App Insights “classic”. We could switch our account to App Insights “workspaces”, where you can specify the storage retention… but it comes with a price: 2.64 EUR per GB for 2 years of retention. Now imagine we ingest on average about 6 TB of data per month across all of our App Insights accounts. That is 144 TB of data after 2 years … and an extra bill for data retention of around 16 000 EUR/month (based on the App Insights retention pricing of 0.110 EUR/GB/month).

OK… but we do not need to query that data daily. In fact, in the last few years we needed to check something in the logs fewer than five times.

Thanks to continuous export we were able to reduce the costs of keeping the App Insights archive.

How does it work? You simply configure a few settings via the Azure portal or CLI/PowerShell, and a pipeline in Azure that you don’t need to take care of starts exporting App Insights data to a selected Azure blob storage container. You can configure what will be exported (e.g. messages, requests…), but that’s all. The format of the files is JSON, with a folder structure like /appInsightsAccount/type/yyyy-MM-dd/HH/guid_ID.blob.
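For reference, a continuous export can be set up from PowerShell roughly like this. This is only a sketch using the Az.ApplicationInsights module: all resource names, the document types and the SAS URI are placeholders, and the exact parameter set may differ between module versions, so check Get-Help for your installed version.

```powershell
# Sketch: enable continuous export for an App Insights component (Az.ApplicationInsights).
# All names and the SAS URI below are placeholders; verify the parameter set with
# Get-Help New-AzApplicationInsightsContinuousExport for your module version.
New-AzApplicationInsightsContinuousExport `
    -ResourceGroupName "my-rg" `
    -Name "my-appinsights" `
    -DocumentType "Trace", "Request" `
    -StorageAccountId "/subscriptions/<sub>/resourceGroups/my-rg/providers/Microsoft.Storage/storageAccounts/myarchive" `
    -StorageLocation "westeurope" `
    -StorageSASUri "https://myarchive.blob.core.windows.net/export?<sas-token>"
```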

Costs? With the first generation of blob storage, the 2-year retention had cost us about 2 300 EUR/month.

The second generation of blob storage offers three tiers: hot, cool and archive.

The last one seemed to be perfect for our needs and it also cut the costs down to only about 220 EUR/month for 2 years of data retention (144 TB).


Storage configuration

How did we achieve automatic tier selection? First we configured the default tier for blobs to be “cool”.

Azure Blob Configuration Default Tier as Cool

Unfortunately, as you can see, you cannot specify the “archive” tier as the default one. However, you can define a lifecycle policy that covers this…

Azure Blob Configuration Lifecycle
{
  "rules": [{
    "enabled": true,
    "name": "ReduceSpending",
    "type": "Lifecycle",
    "definition": {
      "actions": {
        "baseBlob": {
          "tierToCool": { "daysAfterModificationGreaterThan": 1 },
          "tierToArchive": { "daysAfterModificationGreaterThan": 30 },
          "delete": { "daysAfterModificationGreaterThan": 800 }
        }
      },
      "filters": { "blobTypes": ["blockBlob"] }
    }
  }]
}
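The same policy can also be applied from PowerShell instead of the portal. A sketch using the Az.Storage management-policy cmdlets; the resource group and account names are placeholders:

```powershell
# Sketch: apply the same tiering/delete rule via Az.Storage (names are placeholders).
$action = Add-AzStorageAccountManagementPolicyAction -BaseBlobAction TierToCool -DaysAfterModificationGreaterThan 1
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction TierToArchive -DaysAfterModificationGreaterThan 30
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction Delete -DaysAfterModificationGreaterThan 800
$filter = New-AzStorageAccountManagementPolicyFilter -BlobType blockBlob
$rule   = New-AzStorageAccountManagementPolicyRule -Name ReduceSpending -Action $action -Filter $filter
Set-AzStorageAccountManagementPolicy -ResourceGroupName "my-rg" -StorageAccountName "myarchive" -Rule $rule
```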

Azure Blob Storage Archive tier re-hydration

Having the data in the archive tier also has some caveats. For instance: you do not have direct access to your data. You cannot simply download a blob or connect it via Data Factory.

To access your data you need to re-hydrate it first. And it sort of makes sense: files in the archive tier are probably stored on some tape library, so before you can read them the tape needs to be put into a reader and rewound to the proper location, and the data needs to be written to some HDD (cool) or SSD (hot) storage. This is called re-hydration, and it will cost you 0.0203 EUR/GB and ~5 EUR per 10 000 read operations.

Keeping the above in mind, retrieval of one day of archived App Insights data (e.g. messages) costs 2 880 operations and ~50 GB of data in our case (retrieving just traces/messages for a selected region): 50 GB × 0.0203 EUR ≈ 1 EUR, plus 2 880 ÷ 10 000 × 5 EUR ≈ 1.4 EUR, so about 2.5 EUR for re-hydration of one day of messages.

Re-hydrating and preparing ~1.5 TB of data (one region, one month) for an ad-hoc query would cost us about 75 EUR.

# Sample PowerShell used to re-hydrate the archive.
# Please create the target Azure storage account prior to running this script.
# You might want to choose gen2 and enable Data Lake; see the querying section further in the article.
Set-AzContext -SubscriptionName XXX
$ctx = Get-AzStorageAccount -ResourceGroupName XXX -Name XXX | Select-Object -ExpandProperty Context
Get-AzStorageBlob -Context $ctx -Container XXX -Blob 'ainame_xxx/Messages/2019-03-*/*' |
    Select-Object -ExpandProperty Name |
    ForEach-Object {
        Start-AzStorageBlobCopy -SrcContainer XXX -SrcBlob $_ -DestContainer XXX -DestBlob $_ `
            -StandardBlobTier Hot -RehydratePriority Standard -Context $ctx
    }
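Standard-priority re-hydration can take many hours, so it helps to poll the blobs until they land in the target tier. A sketch, reusing $ctx and the same XXX placeholders as the script above; the exact property names may vary between Az.Storage versions:

```powershell
# Sketch: report re-hydration progress (placeholders as above).
# RehydrationStatus is non-empty (e.g. PendingToHot) while a blob is still being
# re-hydrated; the tier reads Hot once the blob is ready to download.
Get-AzStorageBlob -Context $ctx -Container XXX -Blob 'ainame_xxx/Messages/2019-03-*/*' |
    ForEach-Object {
        [pscustomobject]@{
            Name   = $_.Name
            Tier   = $_.ICloudBlob.Properties.StandardBlobTier
            Status = $_.ICloudBlob.Properties.RehydrationStatus
        }
    }
```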


Ad-hoc queries

Once we have our archive data re-hydrated in some new Azure storage, you may want to either ingest it into, or attach it as an external table to, an existing or new Azure Data Explorer database.

Let’s assume you have an existing instance, something like “aionlineadhocqueries”, with one existing database with a short period of data retention:

Data Explorer Database

Prepare the table and mapping

First you need to ingest one blob to prepare the target table and set up the mapping.

  1. Start ad-hoc ingestion
  2. Prepare a single sample blob with sample JSON data from the re-hydrated source. The blob account can’t be Data Lake gen2, as it does not support the SAS token. So in case you have re-hydrated your source data to a Data Lake gen2 storage, please copy one file from it to some new or existing temporary blob account. Don’t worry, it’s a one-time obstacle
  3. Select the database you have created (e.g. AdHocQueries) and a proper name for the new table; let’s use “messages201903” in our examples below
  4. Include the link to the single sample blob, including the SAS token
  5. Continue to “Edit schema” (see screenshot below): e.g. switch to JSON, increase JSON levels, …
  6. Don’t forget to copy the mapping name
  7. Start ingestion! 🙂
Azure Data Explorer Adhoc Ingest: Edit Schema

After a few moments you should be able to see the table created and populated with data. You can check the table structure and data with a simple “messages201903 | take 10” Kusto query, and the mapping created with “.show table messages201903 ingestion mappings”.


Option A: ingest data via LightIngest

Now we need to ingest the rest of the data from the source blob storage. We will use LightIngest (shipped in the Microsoft.Azure.Kusto.Tools NuGet package) to do so:

# Run via PowerShell:
Install-Package -Name Microsoft.Azure.Kusto.Tools -ProviderName NuGet
cd (Split-Path (Get-Package -Name Microsoft.Azure.Kusto.Tools).Source)
cd tools
.\LightIngest.exe ";fed=true" -database:AdHocQueries -table:messages201903 -source:";ocJcdoPh72yx.......PRIMARY/SECONDARY..KEY==" -pattern:"*.blob" -format:json -ingestionMappingRef:AppInsightsMessages_mapping
Azure Data Explorer Light Ingest execution
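Once LightIngest finishes, a quick sanity check in the database is worthwhile. For example (the management command must be run as a separate request from the query):

```
// Query: how many rows landed in the table
messages201903 | count

// Management command (run separately): recent ingestion failures, if any
.show ingestion failures
```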


Option B: use the external table

Alternatively you may:
 1) Prepare the external table
 2) Prepare the external table mapping
 3) Use external_table(“messages201903Ext”) in your queries instead of messages201903
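With those in place, querying the external table looks the same as querying the ingested one, just wrapped in external_table(). A minimal example; the column name is taken from the table schema shown later in the article:

```
// Query the re-hydrated blobs directly, without ingesting them
external_table("messages201903Ext")
| where context_data_eventTime between (datetime(2019-03-01) .. datetime(2019-03-08))
| take 10
```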

Prepare the external table

Right-click the one-time-ingested table in Data Explorer and copy its schema:

Azure Data Explorer Copy Table Schema

Run the following in Data Explorer:

.create-or-alter external table messages201903Ext (*** original columns ***)
kind=adl
dataformat=json ( h@'abfss://;Fxl5WPRIMARYorSECONDARYkey********************==' )
with (fileExtension = ".blob")

Example with table schema: 

.create-or-alter external table messages201903Ext (message: dynamic, internal_data_id: guid, internal_data_documentVersion: real, context_application_version: string, context_data_eventTime: datetime) kind=adl…

Prepare the external table mapping

Now we need to create a mapping for the external table. We can utilize the existing mapping already created via the one-time-ingested table:

.show table messages201903 ingestion mappings

Create the mapping based on the output of the above command, e.g.

.create external table messages201903Ext json mapping "messages201903Ext_Mapping" 


(note: you need to replace the curly quotes in the original schema output with straight quotes)

Query the data

You may use the web interface or the rich desktop client: download and install the Kusto.Explorer tool.

Now you can use the Kusto language the same way as you are used to querying App Insights, e.g.

messages201903
| summarize count() by bin(context_data_eventTime, 1h)
| render timechart
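A couple more query shapes translate directly from the App Insights portal; for example, counting messages per application version (column names as in the external-table schema shown earlier):

```
// Messages per application version for a given week
messages201903
| where context_data_eventTime between (datetime(2019-03-04) .. datetime(2019-03-11))
| summarize count() by context_application_version
```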

