Datadog helps detect 'problem' Spark and Databricks jobs – Blocks and Files

Cloud monitoring and security firm Datadog has introduced Data Jobs Monitoring, which allows teams to detect problematic Spark and Databricks jobs anywhere in their data pipelines. It also allows them to remediate failed and long-running jobs faster, and to optimize over-provisioned compute resources to reduce costs, the provider promises.

Data Jobs Monitoring is said to immediately surface specific jobs that need optimization and reliability improvements, while letting teams drill down into job execution traces so that they can correlate job telemetry with their cloud infrastructure for “fast debugging.”

Commenting on the technology, Matt Camilli, head of engineering at Rhythm Energy, said: “My team is able to resolve our Databricks job failures 20 percent faster, because of how easy it is to set up real-time alerting and find the root cause of the failing job.”

“When data pipelines fail, data quality is impacted, which can hurt stakeholder trust and slow down decision making,” added Michael Whetten, VP of product at Datadog. “Data Jobs Monitoring gives data platform engineers full visibility into their largest, most expensive jobs, to help them improve data quality, optimize their pipelines and prioritize cost savings.”

Out-of-the-box alerts immediately notify teams when jobs have failed or are running beyond automatically detected baselines, so that problems can be addressed before they harm the end-user experience. Recommended filters in Data Jobs Monitoring surface the most important issues affecting job and cluster health, so that they can be prioritized.
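The announcement does not say how these baselines are computed. As a rough sketch only, one simple way to flag a run that exceeds a baseline is to derive a threshold from past run durations. The function names, the `k` parameter, and the median-plus-MAD rule below are all illustrative assumptions, not Datadog's method or API.

```python
from statistics import median


def duration_baseline(history_s, k=3.0):
    """Return an upper bound on a 'normal' run duration, in seconds.

    Assumption: the baseline is median + k * MAD (median absolute
    deviation) of past durations. This is a generic robust threshold,
    not the rule Datadog uses.
    """
    if not history_s:
        raise ValueError("need at least one historical run duration")
    med = median(history_s)
    mad = median([abs(d - med) for d in history_s])
    return med + k * mad


def is_overrunning(current_s, history_s, k=3.0):
    """True if the current run has lasted longer than the baseline."""
    return current_s > duration_baseline(history_s, k)


# Hypothetical durations (seconds) of five earlier runs of one job.
past_runs = [600, 620, 610, 590, 605]
print(is_overrunning(900, past_runs))  # a 15-minute run
print(is_overrunning(615, past_runs))  # a run close to the usual time
```

A median-based threshold is used here because a single unusually slow historical run shifts it far less than it would shift a mean.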

In addition, detailed trace views show teams exactly where a job failed in its execution flow, so they have the full context for faster troubleshooting. Multiple job runs can also be compared with one another to expedite root cause analysis and to identify trends and changes in run duration, Spark performance metrics, cluster utilization, and configuration.

Finally, resource utilization and Spark application metrics help teams identify ways to lower compute costs for over-provisioned clusters and optimize inefficient job runs.

A Gartner Magic Quadrant named Dynatrace, Datadog, New Relic, Splunk, and Honeycomb as the leading observability and APM vendors in 2023. The MQ mentioned 14 other vendors.
