Azure Cosmos DB

Azure Cosmos DB is a globally distributed, multi-model database service. Its core features are available regardless of which API or data model you choose.

Features of Cosmos DB

  • Turnkey global distribution

Cosmos DB exposes global distribution and availability as a configuration setting, available in the portal, via the command line, or through an ARM template, which makes replicating data to a new region as seamless as possible. Both manual and automatic failover are supported, as are multi-region reads and multi-region writes across primary and replica databases.

  • Elastic storage and throughput

Cosmos DB scales database storage and throughput elastically under a consumption-based pricing model, so there is no need to pre-provision resources to account for future growth. Throughput is measured in a standardized unit called the Request Unit (RU), which can be thought of as an abstraction of the physical resources needed to serve a request. RUs are provisioned per second, e.g. 2000 RU/s.
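As a rough illustration of how an RU/s budget relates to a workload, the sketch below totals the RUs a mix of operations would consume per second and compares the result with a provisioned figure. The per-operation costs are assumed, illustrative numbers (real charges depend on item size, indexing and query shape), and the function and variable names are hypothetical.

```python
def required_rus_per_second(operations):
    """Sum the RU/s needed for a workload.

    `operations` maps an operation name to a tuple of
    (requests per second, assumed RU cost per request).
    """
    return sum(rate * cost for rate, cost in operations.values())


# Assumed, illustrative costs; real charges vary with item size and indexing.
workload = {
    "point_read": (500, 1.0),   # 500 reads/s at about 1 RU each
    "write": (100, 6.0),        # 100 writes/s at about 6 RU each
    "query": (20, 25.0),        # 20 queries/s at about 25 RU each
}

provisioned = 2000  # RU/s, matching the 2000 RU/s example in the text
needed = required_rus_per_second(workload)
print(f"needed={needed:.0f} RU/s, provisioned={provisioned} RU/s, "
      f"fits={needed <= provisioned}")
```

A calculation like this is only a starting point for sizing; the measured request charges of a real workload are the better input.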

Throughput is provisioned at a database or container level.

| Container level | Database level |
| --- | --- |
| Isolated throughput | Containers share throughput |

  • Low latency

Microsoft’s financially backed SLA guarantees that read and write requests complete in under 10 ms 99% of the time.

  • Flexible consistency model

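Cosmos DB offers five consistency levels on a sliding scale (strong, bounded staleness, session, consistent prefix, and eventual), so that replication behavior can be tuned to a specific workload. A default consistency level is configured for the whole account and can be overridden per connection.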

Credit: https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels
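To make the sliding scale concrete, here is a minimal sketch that orders the five levels from strongest to weakest and checks whether a per-connection override is acceptable. It assumes that a client may keep or relax the account default but not strengthen it; the enum and function names are hypothetical and are not part of any SDK.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """The five levels, ordered from strongest (0) to weakest (4)."""
    STRONG = 0
    BOUNDED_STALENESS = 1
    SESSION = 2
    CONSISTENT_PREFIX = 3
    EVENTUAL = 4


def can_override(account_default, requested):
    """Assumption: an override may keep or relax the default, never strengthen it."""
    return requested >= account_default


default = ConsistencyLevel.SESSION
print(can_override(default, ConsistencyLevel.EVENTUAL))  # a weaker level is requested
print(can_override(default, ConsistencyLevel.STRONG))    # a stronger level is requested
```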
  • Enterprise-grade security

A unified security model exists across all APIs, providing built-in encryption at rest and in-transit. IP-based access control is supported.

Access to a Cosmos DB account and its data is controlled by two pairs of keys that are managed by the service: a read-write pair and a read-only pair.

APIs

Cosmos DB exposes data through a variety of models and APIs. When you request data using a specific API, Cosmos DB will automatically handle the translation of data from the underlying data format to the data model required for the API.

| API | Description |
| --- | --- |
| SQL API | Core API with many unique features. Supports JavaScript logic and SQL queries. |
| MongoDB API | Compatible with MongoDB v3.2 protocol. Supports aggregation pipeline. |
| Gremlin API | Compatible with the Apache TinkerPop graph traversal language (Gremlin). Returns results in GraphSON (extended JSON) format. |
| Table API | Service-level compatibility with Azure Storage Tables. Migrate applications with no code changes. |
| Cassandra API | Supports Cassandra Query Language (CQL) v4 protocol. Works out of the box with CQL shell. |
| etcd API | Implements etcd wire protocol. Can be used as a backing store for Azure Kubernetes Service. |

Resource Model

Data in Azure Cosmos DB is stored in a hierarchy of resources: an account contains databases, a database contains containers, and a container holds items.

Indexing

Cosmos DB automatically indexes every field of every item or document by default. That suits many workloads, but indexing everything can hurt performance, particularly for writes, on more complex data sets.

Indexing can therefore be tuned to balance the trade-off between write performance and query performance.

Index policies can be created to configure indexes by specifying the following:

  • List of paths to index
  • Different types of indexing to perform
  • List of paths to exclude
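The options listed above can be sketched as an indexing policy document. The example below builds one as a Python dictionary and prints it as JSON. The field names follow the Cosmos DB indexing policy format, the property paths are hypothetical, and per-path index types are left out to keep the sketch short.

```python
import json

# Opt-in policy: index only two (hypothetical) properties and exclude the rest.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {"path": "/customerId/?"},  # hypothetical property
        {"path": "/createdAt/?"},   # hypothetical property
    ],
    "excludedPaths": [
        {"path": "/*"},             # everything not explicitly included
    ],
}

print(json.dumps(indexing_policy, indent=2))
```

Excluding the root path and including only the properties that queries filter on is one way to favor write performance; the opposite arrangement (include everything, exclude a few large properties) favors query flexibility.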

Types of indexes

| Range | Hash | Spatial |
| --- | --- | --- |
| Provides comparison functionality | Quick lookup for exact match information | Used for geographical information |

Azure Event Grid

Azure Event Grid is a managed event routing service that enables standardized event consumption using a publish-subscribe model.

An event is a record of something that has occurred. In Event Grid, an event is limited to 64 KB. Some examples include:

  1. A new client has signed up with your organization
  2. A client has initiated a payment that needs to take a specific clearing route
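As an illustration, the first example could be represented roughly as follows. The field names mirror the Event Grid event schema, while the subject path, event type and payload are hypothetical. The helper checks the serialized size against the 64 KB limit.

```python
import json
import uuid
from datetime import datetime, timezone

MAX_EVENT_BYTES = 64 * 1024  # the 64 KB limit mentioned in the text


def make_event(subject, event_type, data):
    """Build a dict shaped like an Event Grid event."""
    return {
        "id": str(uuid.uuid4()),
        "subject": subject,
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,
        "dataVersion": "1.0",
    }


def fits_size_limit(event):
    """True if the JSON-serialized event is within the size limit."""
    return len(json.dumps(event).encode("utf-8")) <= MAX_EVENT_BYTES


event = make_event(
    subject="/crm/clients/1234",              # hypothetical subject path
    event_type="Contoso.Crm.ClientSignedUp",  # hypothetical event type
    data={"clientId": "1234", "plan": "standard"},
)
print(fits_size_limit(event))
```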

Azure Event Grid supports a number of event sources. An event source is where the event took place. For the examples above, the sources could be:

  1. A customer relationship management (CRM) system
  2. A digital banking channel

Publishers send events to an endpoint known as a topic, and may use a single topic or several.

An event subscription is the mechanism that routes events to multiple handlers and subscribers. Subscriptions are also used by handlers to intelligently filter incoming events.

An event handler is the application or service that processes the event, e.g. Azure Functions, Event Hubs, Azure Logic Apps or webhooks.
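The relationship between topics, subscriptions and handlers can be sketched with a small in-memory model. This is not the Event Grid SDK; the `Topic` class, its methods and the event types are hypothetical stand-ins that only show how a subscription's filter decides which handlers receive an event.

```python
from collections import defaultdict


class Topic:
    """In-memory stand-in for a topic: publishers send events here."""

    def __init__(self):
        self._subscriptions = []  # list of (allowed event types or None, handler)

    def subscribe(self, handler, event_types=None):
        """Register a handler, optionally filtered by event type."""
        allowed = set(event_types) if event_types else None
        self._subscriptions.append((allowed, handler))

    def publish(self, event):
        """Route the event to every matching subscription; return the match count."""
        delivered = 0
        for allowed, handler in self._subscriptions:
            if allowed is None or event["eventType"] in allowed:
                handler(event)
                delivered += 1
        return delivered


received = defaultdict(list)
topic = Topic()
topic.subscribe(lambda e: received["audit"].append(e))  # no filter: sees everything
topic.subscribe(lambda e: received["payments"].append(e),
                event_types=["PaymentInitiated"])       # filtered subscription

topic.publish({"eventType": "ClientSignedUp", "data": {"clientId": "1234"}})
topic.publish({"eventType": "PaymentInitiated", "data": {"amount": 250}})
print({name: len(events) for name, events in received.items()})
```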

Event Grid authenticates three kinds of operations: webhook event delivery, event subscriptions, and custom topic publishing. Role-based access control (RBAC) with various action types is also supported to manage and control authorization.

With Webhooks, you can include additional parameters for security such as a secret or an access token that is passed in as a query string. Only HTTPS endpoints are supported.
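A webhook handler might check such a query-string secret along the following lines. This is a minimal sketch using only the standard library; the parameter name `code`, the secret value and the function name are assumptions made for illustration.

```python
import hmac
from urllib.parse import parse_qs, urlparse

EXPECTED_SECRET = "replace-with-a-real-secret"  # hypothetical; load from secure config


def is_authorized(request_url, expected=EXPECTED_SECRET, param="code"):
    """Return True only for HTTPS URLs whose query string carries the expected secret."""
    parsed = urlparse(request_url)
    if parsed.scheme != "https":
        return False
    supplied = parse_qs(parsed.query).get(param, [""])[0]
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(supplied.encode(), expected.encode())


print(is_authorized("https://example.com/hook?code=" + EXPECTED_SECRET))
print(is_authorized("https://example.com/hook?code=wrong"))
```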

Custom topics support two types of authentication mechanisms, either a secret key or a shared access signature.

Benefits

  1. Simple and powerful with easy configuration
  2. The ability to filter on event types or event publish paths
  3. A single endpoint can subscribe to many events (fan-in)
  4. A single endpoint can publish copies of an event to many subscribers (fan-out)
  5. Can accommodate high throughput (millions of events per second)
  6. Consumption-based model: pay per event
  7. Reliable, with 24-hour retry capability and exponential backoff
  8. Many built-in event types
  9. The flexibility to create custom events

Potential Architectural Patterns

Comparison of messaging services

| Event Grid | Event Hubs | Service Bus |
| --- | --- | --- |
| Reactive programming | Big data pipeline | High-value enterprise messaging |
| Event distribution (discrete) | Event streaming (series) | Message |
| React to changes | Telemetry & distributed data streaming | Order processing, financial transactions |

Delivery Caveats

  1. Event Grid tries to deliver each event at least once for each subscription
  2. Events are sent to the registered endpoint of each subscription immediately
  3. If an endpoint does not acknowledge receipt of an event, Event Grid retries delivery of the event
  4. You can customize the retry schedule
  5. If an event cannot be delivered, it is sent to a storage account, which is itself an event source that you can act on
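The delivery behavior described above can be sketched as a retry loop with exponential backoff and a dead-letter fallback. The delays, the cap and the function names below are assumptions chosen for illustration and do not reproduce Event Grid's actual retry schedule. The `sleep` parameter defaults to a no-op so the sketch runs instantly.

```python
def backoff_schedule(first_delay, factor, max_delay, attempts):
    """Exponentially growing delays in seconds, capped at max_delay."""
    return [min(first_delay * factor ** i, max_delay) for i in range(attempts)]


def deliver(event, send, schedule, dead_letter, sleep=lambda seconds: None):
    """One immediate attempt, then one retry per delay in the schedule.

    `send` returns True when the endpoint acknowledges the event.
    Returns the number of attempts made; an event that is never
    acknowledged is appended to `dead_letter`.
    """
    if send(event):
        return 1
    for attempt, delay in enumerate(schedule, start=2):
        sleep(delay)
        if send(event):
            return attempt
    dead_letter.append(event)
    return len(schedule) + 1


schedule = backoff_schedule(first_delay=10, factor=2, max_delay=300, attempts=6)

# A flaky endpoint that acknowledges on the third attempt.
responses = iter([False, False, True])
dead = []
attempts = deliver({"id": "1"}, lambda e: next(responses), schedule, dead)
print(schedule, attempts, dead)
```

Because an acknowledgement can be lost after the handler has already processed an event, handlers in an at-least-once system are usually written to tolerate duplicates.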

Introduction to Azure Functions

What are Azure Functions?

Azure Functions is a serverless, cross-platform and open source solution that enables a developer to implement functionality with minimal code on managed infrastructure.

Azure Functions scales dynamically and supports several programming languages, including .NET languages, Java, JavaScript and Python.

A function runs only when it is triggered, and it can be connected to different sources and targets through bindings.

To host a Function in Azure, a function app is required which will logically group together functions for easier management, deployment, scaling and sharing of resources.

Various hosting plans are available to host a function app:

| Plan | Advantages | Disadvantages |
| --- | --- | --- |
| Consumption Plan | 1. Pay only when your functions are executed. 2. Dynamically scale for usage and demand. | 1. Cold starts: a brief delay when the function starts executing. |
| Premium Plan | 1. Perpetually warm instances. 2. VNet connectivity. 3. Unlimited execution duration. 4. Premium instance sizes. | 1. Price |
| Dedicated Plan | 1. Dedicated VMs. 2. Reuse your existing app services. | 1. Not serverless |

Anatomy of a Function App

Core Files

  • host.json

The host.json file contains global configuration options and will impact all functions within the function app.

{
     "version": "2.0",
     "logging": {
         "applicationInsights": {
              "samplingExcludedTypes": "Request",
              "samplingSettings": {
                   "isEnabled": true
              }
         }
     }
}
  • function.json

The function.json file contains the configuration metadata such as related triggers and bindings for an individual function.

To learn more about the JSON schema for Azure Functions function.json files, check out http://json.schemastore.org/function.
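For illustration, the sketch below builds the contents of a function.json for an HTTP-triggered function and prints it as JSON. The binding fields follow the published schema; the `scriptFile` value and the binding name `req` are example choices rather than required values.

```python
import json

function_json = {
    "scriptFile": "__init__.py",  # example entry point for a Python function
    "bindings": [
        {   # trigger: what causes the function to run
            "authLevel": "function",
            "type": "httpTrigger",
            "direction": "in",
            "name": "req",
            "methods": ["get", "post"],
        },
        {   # output binding: the HTTP response
            "type": "http",
            "direction": "out",
            "name": "$return",
        },
    ],
}

# A function has exactly one trigger among its bindings.
triggers = [b for b in function_json["bindings"] if b["type"].endswith("Trigger")]
print(json.dumps(function_json, indent=2))
print("trigger count:", len(triggers))
```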

  • local.settings.json

The local.settings.json file contains local configuration settings for your application. It is meant for local development only: it is ignored by Git and is not published to Azure. If you are developing a function app with .NET Core, you can use the IConfiguration infrastructure to read environment variables, user secrets and other configuration providers in addition to the local.settings.json file.

{
    "Values": {
        "AzureWebJobsStorage": "UseDevelopmentStorage=true",
        "FUNCTIONS_WORKER_RUNTIME": "dotnet"
    }
}
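The same layering idea (environment variables first, local file as a fallback) can be sketched outside .NET as well. The helpers below are hypothetical and use only the standard library; they are handy, for example, in unit tests that run without the Functions host.

```python
import json
import os


def load_local_values(path="local.settings.json"):
    """Return the "Values" section of a local settings file, or {} if it is absent."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle).get("Values", {})
    except FileNotFoundError:
        return {}


def get_setting(name, local_values=None, environ=os.environ):
    """Prefer a real environment variable, then fall back to the local values."""
    if name in environ:
        return environ[name]
    return (local_values or {}).get(name)


values = load_local_values()  # {} when no local.settings.json is present
print(get_setting("FUNCTIONS_WORKER_RUNTIME", values))
```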

Language and runtime support

Azure Functions can run on either Linux or Windows, depending on the runtime stack you choose when creating your function app.

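| Language | Runtime stack | Linux | Windows | Portal editing |
| --- | --- | --- | --- | --- |
| C# class library | .NET | ✔️ | ✔️ | |
| C# | .NET | ✔️ | ✔️ | ✔️ |
| JavaScript | Node.js | ✔️ | ✔️ | ✔️ |
| Python | Python | ✔️ | | |
| Java | Java | ✔️ | ✔️ | |
| PowerShell | PowerShell Core | ✔️ | ✔️ | ✔️ |
| TypeScript | Node.js | ✔️ | ✔️ | |
| Go/Rust/Other | Custom Handlers | ✔️ | ✔️ | |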