openlineage.client.client module

class openlineage.client.client.OpenLineageClientOptions(timeout=5.0, verify=True, api_key=None, adapter=None)

Bases: object

Parameters:
  • timeout (float)

  • verify (bool)

  • api_key (Optional[str])

  • adapter (Optional[HTTPAdapter])

timeout: float
verify: bool
api_key: str
adapter: HTTPAdapter
class openlineage.client.client.OpenLineageClient(url=None, options=None, session=None, transport=None, factory=None)

Bases: object

Parameters:
  • url (str | None)

  • options (OpenLineageClientOptions | None)

  • session (Session | None)

  • transport (Transport | None)

  • factory (TransportFactory | None)

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

None

classmethod from_environment()
Return type:

_T

classmethod from_dict(config)
Parameters:

config (dict[str, str])

Return type:

_T

filter_event(event)

Filters jobs according to config-defined events

Parameters:

event (Event)

Return type:

Event | None

property config: dict[str, Any]

openlineage.client.event_v2 module

class openlineage.client.event_v2.BaseEvent(*, eventTime, producer='')

Bases: RedactMixin

Parameters:
  • eventTime (str)

  • producer (str)

eventTime: str

the time the event occurred at

producer: str
schemaURL: str
property skip_redact: list[str]
eventtime_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

producer_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

schemaurl_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

class openlineage.client.event_v2.RunEvent(*, eventTime, producer='', run, job, eventType=None, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING)

Bases: BaseEvent

Parameters:
  • eventTime (str)

  • producer (str)

  • run (Run)

  • job (Job)

  • eventType (EventType | None)

  • inputs (list[InputDataset] | None)

  • outputs (list[OutputDataset] | None)

run: Run
job: Job
eventType: EventType | None

the current transition of the run state. It is required to issue 1 START event and 1 of [ COMPLETE, ABORT, FAIL ] event per run. Additional events with OTHER eventType can be added to the same run. For example to send additional metadata after the run is complete

inputs: list[InputDataset] | None

The set of input datasets.

outputs: list[OutputDataset] | None

The set of output datasets.

class openlineage.client.event_v2.JobEvent(*, eventTime, producer='', job, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING)

Bases: BaseEvent

Parameters:
job: Job
inputs: list[InputDataset] | None

The set of input datasets.

outputs: list[OutputDataset] | None

The set of output datasets.

class openlineage.client.event_v2.DatasetEvent(*, eventTime, producer='', dataset)

Bases: BaseEvent

Parameters:
  • eventTime (str)

  • producer (str)

  • dataset (StaticDataset)

dataset: StaticDataset
openlineage.client.event_v2.RunState

alias of EventType

class openlineage.client.event_v2.Dataset(namespace, name, *, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (dict[str, DatasetFacet] | None)

namespace: str

The namespace containing that dataset

name: str

The unique name for that dataset within that namespace

facets: dict[str, DatasetFacet] | None

The facets for this dataset

class openlineage.client.event_v2.InputDataset(namespace, name, inputFacets=_Nothing.NOTHING, *, facets=_Nothing.NOTHING)

Bases: Dataset

An input dataset

Parameters:
inputFacets: dict[str, InputDatasetFacet] | None

The input facets for this dataset.

class openlineage.client.event_v2.OutputDataset(namespace, name, outputFacets=_Nothing.NOTHING, *, facets=_Nothing.NOTHING)

Bases: Dataset

An output dataset

Parameters:
outputFacets: dict[str, OutputDatasetFacet] | None

The output facets for this dataset

class openlineage.client.event_v2.Run(runId, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • runId (str)

  • facets (dict[str, RunFacet] | None)

runId: str

The globally unique ID of the run associated with the job.

facets: dict[str, RunFacet] | None

The run facets.

runid_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

class openlineage.client.event_v2.Job(namespace, name, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (dict[str, JobFacet] | None)

namespace: str

The namespace containing that job

name: str

The unique name for that job within that namespace

facets: dict[str, JobFacet] | None

The job facets.

openlineage.client.event_v2.set_producer(producer)
Parameters:

producer (str)

Return type:

None

openlineage.client.facet module

openlineage.client.facet.set_producer(producer)
Parameters:

producer (str)

Return type:

None

class openlineage.client.facet.BaseFacet

Bases: RedactMixin

property skip_redact: List[str]
class openlineage.client.facet.NominalTimeRunFacet(nominalStartTime, nominalEndTime=None)

Bases: BaseFacet

Parameters:
  • nominalStartTime (str)

  • nominalEndTime (Optional[str])

nominalStartTime: str
nominalEndTime: Optional[str]
class openlineage.client.facet.ParentRunFacet(run, job)

Bases: BaseFacet

Parameters:
  • run (Dict[Any, Any])

  • job (Dict[Any, Any])

run: Dict[Any, Any]
job: Dict[Any, Any]
classmethod create(runId, namespace, name)
Parameters:
  • runId (str)

  • namespace (str)

  • name (str)

Return type:

ParentRunFacet

class openlineage.client.facet.DocumentationJobFacet(description)

Bases: BaseFacet

Parameters:

description (str)

description: str
class openlineage.client.facet.SourceCodeLocationJobFacet(type, url)

Bases: BaseFacet

Parameters:
  • type (str)

  • url (str)

type: str
url: str
class openlineage.client.facet.SqlJobFacet(query)

Bases: BaseFacet

Parameters:

query (str)

query: str
class openlineage.client.facet.DocumentationDatasetFacet(description)

Bases: BaseFacet

Parameters:

description (str)

description: str
class openlineage.client.facet.SchemaField(name, type, description=None)

Bases: RedactMixin

Parameters:
  • name (str)

  • type (str)

  • description (Optional[str])

name: str
type: str
description: Optional[str]
class openlineage.client.facet.SchemaDatasetFacet(fields)

Bases: BaseFacet

Parameters:

fields (List[SchemaField])

fields: List[SchemaField]
class openlineage.client.facet.DataSourceDatasetFacet(name, uri)

Bases: BaseFacet

Parameters:
  • name (str)

  • uri (str)

name: str
uri: str
class openlineage.client.facet.OutputStatisticsOutputDatasetFacet(rowCount=None, size=None, fileCount=None)

Bases: BaseFacet

Parameters:
  • rowCount (Optional[int])

  • size (Optional[int])

  • fileCount (Optional[int])

rowCount: Optional[int]
size: Optional[int]
fileCount: Optional[int]
class openlineage.client.facet.ColumnMetric(nullCount=None, distinctCount=None, sum=None, count=None, min=None, max=None, quantiles=None)

Bases: object

Parameters:
  • nullCount (Optional[int])

  • distinctCount (Optional[int])

  • sum (Optional[int])

  • count (Optional[int])

  • min (Optional[float])

  • max (Optional[float])

  • quantiles (Optional[Dict[str, float]])

nullCount: Optional[int]
distinctCount: Optional[int]
sum: Optional[int]
count: Optional[int]
min: Optional[float]
max: Optional[float]
quantiles: Optional[Dict[str, float]]
class openlineage.client.facet.DataQualityMetricsInputDatasetFacet(rowCount=None, bytes=None, fileCount=None, columnMetrics=_Nothing.NOTHING)

Bases: BaseFacet

Parameters:
  • rowCount (Optional[int])

  • bytes (Optional[int])

  • fileCount (Optional[int])

  • columnMetrics (Dict[str, ColumnMetric])

rowCount: Optional[int]
bytes: Optional[int]
fileCount: Optional[int]
columnMetrics: Dict[str, ColumnMetric]
class openlineage.client.facet.Assertion(assertion, success, column=None)

Bases: RedactMixin

Parameters:
  • assertion (str)

  • success (bool)

  • column (Optional[str])

assertion: str
success: bool
column: Optional[str]
class openlineage.client.facet.DataQualityAssertionsDatasetFacet(assertions)

Bases: BaseFacet

This facet represents asserted expectations on dataset or it’s column.

Parameters:

assertions (List[Assertion])

assertions: List[Assertion]
class openlineage.client.facet.SourceCodeJobFacet(language, source)

Bases: BaseFacet

This facet represents source code that the job executed.

Parameters:
  • language (str)

  • source (str)

language: str
source: str
class openlineage.client.facet.ExternalQueryRunFacet(externalQueryId, source)

Bases: BaseFacet

Parameters:
  • externalQueryId (str)

  • source (str)

externalQueryId: str
source: str
class openlineage.client.facet.ErrorMessageRunFacet(message, programmingLanguage, stackTrace=None)

Bases: BaseFacet

This facet represents an error message that was the result of a job run.

Parameters:
  • message (str)

  • programmingLanguage (str)

  • stackTrace (Optional[str])

message: str
programmingLanguage: str
stackTrace: Optional[str]
class openlineage.client.facet.SymlinksDatasetFacetIdentifiers(namespace, name, type)

Bases: object

Parameters:
  • namespace (str)

  • name (str)

  • type (str)

namespace: str
name: str
type: str
class openlineage.client.facet.SymlinksDatasetFacet(identifiers=_Nothing.NOTHING)

Bases: BaseFacet

This facet represents dataset symlink names.

Parameters:

identifiers (List[SymlinksDatasetFacetIdentifiers])

identifiers: List[SymlinksDatasetFacetIdentifiers]
class openlineage.client.facet.StorageDatasetFacet(storageLayer, fileFormat)

Bases: BaseFacet

This facet represents dataset symlink names.

Parameters:
  • storageLayer (str)

  • fileFormat (str)

storageLayer: str
fileFormat: str
class openlineage.client.facet.OwnershipJobFacetOwners(name, type=None)

Bases: object

Parameters:
  • name (str)

  • type (Optional[str])

name: str
type: Optional[str]
class openlineage.client.facet.OwnershipJobFacet(owners=_Nothing.NOTHING)

Bases: BaseFacet

This facet represents ownership of a job.

Parameters:

owners (List[OwnershipJobFacetOwners])

owners: List[OwnershipJobFacetOwners]
class openlineage.client.facet.JobTypeJobFacet(processingType, integration, jobType)

Bases: BaseFacet

This facet represents job type properties.

Parameters:
  • processingType (str)

  • integration (str)

  • jobType (str)

processingType: str
integration: str
jobType: str
class openlineage.client.facet.DatasetVersionDatasetFacet(datasetVersion)

Bases: BaseFacet

This facet represents version of a dataset.

Parameters:

datasetVersion (str)

datasetVersion: str
class openlineage.client.facet.LifecycleStateChange(value)

Bases: Enum

An enumeration.

ALTER = 'ALTER'
CREATE = 'CREATE'
DROP = 'DROP'
OVERWRITE = 'OVERWRITE'
RENAME = 'RENAME'
TRUNCATE = 'TRUNCATE'
class openlineage.client.facet.LifecycleStateChangeDatasetFacetPreviousIdentifier(name, namespace)

Bases: object

Parameters:
  • name (str)

  • namespace (str)

name: str
namespace: str
class openlineage.client.facet.LifecycleStateChangeDatasetFacet(lifecycleStateChange, previousIdentifier)

Bases: BaseFacet

This facet represents information of lifecycle changes of a dataset.

Parameters:
lifecycleStateChange: LifecycleStateChange
previousIdentifier: LifecycleStateChangeDatasetFacetPreviousIdentifier
class openlineage.client.facet.OwnershipDatasetFacetOwners(name, type)

Bases: object

Parameters:
  • name (str)

  • type (str)

name: str
type: str
class openlineage.client.facet.OwnershipDatasetFacet(owners=_Nothing.NOTHING)

Bases: BaseFacet

This facet represents ownership of a dataset.

Parameters:

owners (List[OwnershipDatasetFacetOwners])

owners: List[OwnershipDatasetFacetOwners]
class openlineage.client.facet.ColumnLineageDatasetFacetFieldsAdditionalInputFields(namespace, name, field)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • field (str)

namespace: str
name: str
field: str
class openlineage.client.facet.ColumnLineageDatasetFacetFieldsAdditional(inputFields, transformationDescription, transformationType)

Bases: object

Parameters:
inputFields: ClassVar[List[ColumnLineageDatasetFacetFieldsAdditionalInputFields]]
transformationDescription: str
transformationType: str
class openlineage.client.facet.ColumnLineageDatasetFacet(fields=_Nothing.NOTHING)

Bases: BaseFacet

This facet contains column lineage of a dataset.

Parameters:

fields (Dict[str, ColumnLineageDatasetFacetFieldsAdditional])

fields: Dict[str, ColumnLineageDatasetFacetFieldsAdditional]
class openlineage.client.facet.ProcessingEngineRunFacet(version, name, openlineageAdapterVersion)

Bases: BaseFacet

Parameters:
  • version (str)

  • name (str)

  • openlineageAdapterVersion (str)

version: str
name: str
openlineageAdapterVersion: str
class openlineage.client.facet.ExtractionError(errorMessage, stackTrace, task, taskNumber)

Bases: BaseFacet

Parameters:
  • errorMessage (str)

  • stackTrace (Optional[str])

  • task (Optional[str])

  • taskNumber (Optional[int])

errorMessage: str
stackTrace: Optional[str]
task: Optional[str]
taskNumber: Optional[int]
class openlineage.client.facet.ExtractionErrorRunFacet(totalTasks, failedTasks, errors)

Bases: BaseFacet

Parameters:
totalTasks: int
failedTasks: int
errors: List[ExtractionError]

openlineage.client.facet_v2 module

class openlineage.client.facet_v2.BaseFacet(*, producer='')

Bases: RedactMixin

all fields of the base facet are prefixed with _ to avoid name conflicts in facets

Parameters:

producer (str)

property skip_redact: list[str]
class openlineage.client.facet_v2.DatasetFacet(*, producer='', deleted=None)

Bases: BaseFacet

A Dataset Facet

Parameters:
  • producer (str)

  • deleted (bool | None)

class openlineage.client.facet_v2.InputDatasetFacet(*, producer='')

Bases: BaseFacet

An Input Dataset Facet

Parameters:

producer (str)

class openlineage.client.facet_v2.JobFacet(*, producer='', deleted=None)

Bases: BaseFacet

A Job Facet

Parameters:
  • producer (str)

  • deleted (bool | None)

class openlineage.client.facet_v2.OutputDatasetFacet(*, producer='')

Bases: BaseFacet

An Output Dataset Facet

Parameters:

producer (str)

class openlineage.client.facet_v2.RunFacet(*, producer='')

Bases: BaseFacet

A Run Facet

Parameters:

producer (str)

openlineage.client.facet_v2.set_producer(producer)
Parameters:

producer (str)

Return type:

None

openlineage.client.filter module

class openlineage.client.filter.Filter

Bases: object

filter_event(event)
Parameters:

event (RunEventType)

Return type:

RunEventType | None

class openlineage.client.filter.ExactMatchFilter(match)

Bases: Filter

Parameters:

match (str)

filter_event(event)
Parameters:

event (RunEventType)

Return type:

RunEventType | None

class openlineage.client.filter.RegexFilter(regex)

Bases: Filter

Parameters:

regex (str)

filter_event(event)
Parameters:

event (RunEventType)

Return type:

RunEventType | None

openlineage.client.filter.create_filter(conf)
Parameters:

conf (dict[str, str])

Return type:

Filter | None

openlineage.client.run module

class openlineage.client.run.RunState(value)

Bases: Enum

An enumeration.

START = 'START'
RUNNING = 'RUNNING'
COMPLETE = 'COMPLETE'
ABORT = 'ABORT'
FAIL = 'FAIL'
OTHER = 'OTHER'
class openlineage.client.run.Dataset(namespace, name, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (Dict[Any, Any])

namespace: str
name: str
facets: Dict[Any, Any]
class openlineage.client.run.InputDataset(namespace, name, facets=_Nothing.NOTHING, inputFacets=_Nothing.NOTHING)

Bases: Dataset

Parameters:
  • namespace (str)

  • name (str)

  • facets (Dict[Any, Any])

  • inputFacets (Dict[Any, Any])

inputFacets: Dict[Any, Any]
class openlineage.client.run.OutputDataset(namespace, name, facets=_Nothing.NOTHING, outputFacets=_Nothing.NOTHING)

Bases: Dataset

Parameters:
  • namespace (str)

  • name (str)

  • facets (Dict[Any, Any])

  • outputFacets (Dict[Any, Any])

outputFacets: Dict[Any, Any]
class openlineage.client.run.DatasetEvent(eventTime, producer, schemaURL, dataset)

Bases: RedactMixin

Parameters:
  • eventTime (str)

  • producer (str)

  • schemaURL (str)

  • dataset (Dataset)

eventTime: str
producer: str
schemaURL: str
dataset: Dataset
class openlineage.client.run.Job(namespace, name, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (Dict[Any, Any])

namespace: str
name: str
facets: Dict[Any, Any]
class openlineage.client.run.JobEvent(eventTime, producer, schemaURL, job, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • eventTime (str)

  • producer (str)

  • schemaURL (str)

  • job (Job)

  • inputs (Optional[List[Dataset]])

  • outputs (Optional[List[Dataset]])

eventTime: str
producer: str
schemaURL: str
job: Job
inputs: Optional[List[Dataset]]
outputs: Optional[List[Dataset]]
class openlineage.client.run.Run(runId, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • runId (str)

  • facets (Dict[Any, Any])

runId: str
facets: Dict[Any, Any]
check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

class openlineage.client.run.RunEvent(eventType, eventTime, run, job, producer, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING, schemaURL='https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent')

Bases: RedactMixin

Parameters:
  • eventType (RunState)

  • eventTime (str)

  • run (Run)

  • job (Job)

  • producer (str)

  • inputs (Optional[List[Dataset]])

  • outputs (Optional[List[Dataset]])

  • schemaURL (str)

eventType: RunState
eventTime: str
run: Run
job: Job
producer: str
inputs: Optional[List[Dataset]]
outputs: Optional[List[Dataset]]
schemaURL: str
check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

openlineage.client.serde module

class openlineage.client.serde.Serde

Bases: object

classmethod remove_nulls_and_enums(obj)
Parameters:

obj (Any)

Return type:

Any

classmethod to_dict(obj)
Parameters:

obj (Any)

Return type:

dict[Any, Any]

classmethod to_json(obj)
Parameters:

obj (Any)

Return type:

str

openlineage.client.utils module

openlineage.client.utils.import_from_string(path)
Parameters:

path (str)

Return type:

type[Any]

openlineage.client.utils.try_import_from_string(path)
Parameters:

path (str)

Return type:

type[Any] | None

openlineage.client.utils.get_only_specified_fields(clazz, params)
Parameters:
  • clazz (type[Any])

  • params (dict[str, Any])

Return type:

dict[str, Any]

class openlineage.client.utils.RedactMixin

Bases: object

property skip_redact: list[str]
openlineage.client.utils.load_config()
Return type:

dict[str, Any]

openlineage.client.generated.base module

openlineage.client.generated.base.set_producer(producer)
Parameters:

producer (str)

Return type:

None

class openlineage.client.generated.base.BaseEvent(*, eventTime, producer='')

Bases: RedactMixin

Parameters:
  • eventTime (str)

  • producer (str)

eventTime: str

the time the event occurred at

producer: str
schemaURL: str
property skip_redact
eventtime_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

producer_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

schemaurl_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

class openlineage.client.generated.base.BaseFacet(*, producer='')

Bases: RedactMixin

all fields of the base facet are prefixed with _ to avoid name conflicts in facets

Parameters:

producer (str)

property skip_redact
class openlineage.client.generated.base.Dataset(namespace, name, *, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (dict[str, DatasetFacet] | None)

namespace: str

The namespace containing that dataset

name: str

The unique name for that dataset within that namespace

facets: dict[str, DatasetFacet] | None

The facets for this dataset

class openlineage.client.generated.base.DatasetEvent(*, eventTime, producer='', dataset)

Bases: BaseEvent

Parameters:
  • eventTime (str)

  • producer (str)

  • dataset (StaticDataset)

dataset: StaticDataset
class openlineage.client.generated.base.DatasetFacet(*, producer='', deleted=None)

Bases: BaseFacet

A Dataset Facet

Parameters:
  • producer (str)

  • deleted (bool | None)

class openlineage.client.generated.base.EventType(value)

Bases: Enum

the current transition of the run state. It is required to issue 1 START event and 1 of [ COMPLETE, ABORT, FAIL ] event per run. Additional events with OTHER eventType can be added to the same run. For example to send additional metadata after the run is complete

START = 'START'
RUNNING = 'RUNNING'
COMPLETE = 'COMPLETE'
ABORT = 'ABORT'
FAIL = 'FAIL'
OTHER = 'OTHER'
class openlineage.client.generated.base.InputDataset(namespace, name, inputFacets=_Nothing.NOTHING, *, facets=_Nothing.NOTHING)

Bases: Dataset

An input dataset

inputFacets: dict[str, InputDatasetFacet] | None

The input facets for this dataset.

class openlineage.client.generated.base.InputDatasetFacet(*, producer='')

Bases: BaseFacet

An Input Dataset Facet

Parameters:

producer (str)

class openlineage.client.generated.base.Job(namespace, name, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • facets (dict[str, JobFacet] | None)

namespace: str

The namespace containing that job

name: str

The unique name for that job within that namespace

facets: dict[str, JobFacet] | None

The job facets.

class openlineage.client.generated.base.JobEvent(*, eventTime, producer='', job, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING)

Bases: BaseEvent

job: Job
inputs: list[InputDataset] | None

The set of input datasets.

outputs: list[OutputDataset] | None

The set of output datasets.

class openlineage.client.generated.base.JobFacet(*, producer='', deleted=None)

Bases: BaseFacet

A Job Facet

Parameters:
  • producer (str)

  • deleted (bool | None)

class openlineage.client.generated.base.OutputDataset(namespace, name, outputFacets=_Nothing.NOTHING, *, facets=_Nothing.NOTHING)

Bases: Dataset

An output dataset

outputFacets: dict[str, OutputDatasetFacet] | None

The output facets for this dataset

class openlineage.client.generated.base.OutputDatasetFacet(*, producer='')

Bases: BaseFacet

An Output Dataset Facet

Parameters:

producer (str)

class openlineage.client.generated.base.Run(runId, facets=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • runId (str)

  • facets (dict[str, RunFacet] | None)

runId: str

The globally unique ID of the run associated with the job.

facets: dict[str, RunFacet] | None

The run facets.

runid_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

class openlineage.client.generated.base.RunEvent(*, eventTime, producer='', run, job, eventType=None, inputs=_Nothing.NOTHING, outputs=_Nothing.NOTHING)

Bases: BaseEvent

Parameters:
  • eventTime (str)

  • producer (str)

  • run (Run)

  • job (Job)

  • eventType (EventType | None)

  • inputs (list[InputDataset] | None)

  • outputs (list[OutputDataset] | None)

run: Run
job: Job
eventType: EventType | None

the current transition of the run state. It is required to issue 1 START event and 1 of [ COMPLETE, ABORT, FAIL ] event per run. Additional events with OTHER eventType can be added to the same run. For example to send additional metadata after the run is complete

inputs: list[InputDataset] | None

The set of input datasets.

outputs: list[OutputDataset] | None

The set of output datasets.

class openlineage.client.generated.base.RunFacet(*, producer='')

Bases: BaseFacet

A Run Facet

Parameters:

producer (str)

class openlineage.client.generated.base.StaticDataset(namespace, name, *, facets=_Nothing.NOTHING)

Bases: Dataset

A Dataset sent within static metadata events

Parameters:
  • namespace (str)

  • name (str)

  • facets (dict[str, DatasetFacet] | None)

openlineage.client.generated.column_lineage_dataset module

class openlineage.client.generated.column_lineage_dataset.ColumnLineageDatasetFacet(fields, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • fields (dict[str, Fields])

  • producer (str)

  • deleted (bool | None)

fields: dict[str, Fields]

Column level lineage that maps output fields into input fields used to evaluate them.

class openlineage.client.generated.column_lineage_dataset.Fields(inputFields, transformationDescription=None, transformationType=None)

Bases: RedactMixin

Parameters:
  • inputFields (list[InputField])

  • transformationDescription (str | None)

  • transformationType (str | None)

inputFields: list[InputField]
transformationDescription: str | None

a string representation of the transformation applied

transformationType: str | None

no original data available (like a hash of PII for example)

Type:

IDENTITY|MASKED reflects a clearly defined behavior. IDENTITY

Type:

exact same as input; MASKED

class openlineage.client.generated.column_lineage_dataset.InputField(namespace, name, field)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

  • field (str)

namespace: str

The input dataset namespace

name: str

The input dataset name

field: str

The input field

openlineage.client.generated.data_quality_assertions_dataset module

class openlineage.client.generated.data_quality_assertions_dataset.Assertion(assertion, success, column=None)

Bases: RedactMixin

Parameters:
  • assertion (str)

  • success (bool)

  • column (str | None)

assertion: str

Type of expectation test that dataset is subjected to

success: bool
column: str | None

Column that expectation is testing. It should match the name provided in SchemaDatasetFacet. If column field is empty, then expectation refers to whole dataset.

class openlineage.client.generated.data_quality_assertions_dataset.DataQualityAssertionsDatasetFacet(assertions, *, producer='')

Bases: InputDatasetFacet

list of tests performed on dataset or dataset columns, and their results

Parameters:
  • assertions (list[Assertion])

  • producer (str)

assertions: list[Assertion]

openlineage.client.generated.data_quality_metrics_input_dataset module

class openlineage.client.generated.data_quality_metrics_input_dataset.ColumnMetrics(nullCount=None, distinctCount=None, sum=None, count=None, min=None, max=None, quantiles=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • nullCount (int | None)

  • distinctCount (int | None)

  • sum (float | None)

  • count (float | None)

  • min (float | None)

  • max (float | None)

  • quantiles (dict[str, float] | None)

nullCount: int | None

The number of null values in this column for the rows evaluated

distinctCount: int | None

The number of distinct values in this column for the rows evaluated

sum: float | None

The total sum of values in this column for the rows evaluated

count: float | None

The number of values in this column

min: float | None
max: float | None
quantiles: dict[str, float] | None

0.1 0.25 0.5 0.75 1

Type:

The property key is the quantile. Examples

class openlineage.client.generated.data_quality_metrics_input_dataset.DataQualityMetricsInputDatasetFacet(columnMetrics, rowCount=None, bytes=None, fileCount=None, *, producer='')

Bases: InputDatasetFacet

Parameters:
  • columnMetrics (dict[str, ColumnMetrics])

  • rowCount (int | None)

  • bytes (int | None)

  • fileCount (int | None)

  • producer (str)

columnMetrics: dict[str, ColumnMetrics]

The property key is the column name

rowCount: int | None

The number of rows evaluated

bytes: int | None

The size in bytes

fileCount: int | None

The number of files evaluated

openlineage.client.generated.dataset_version_dataset module

class openlineage.client.generated.dataset_version_dataset.DatasetVersionDatasetFacet(datasetVersion, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • datasetVersion (str)

  • producer (str)

  • deleted (bool | None)

datasetVersion: str

The version of the dataset.

openlineage.client.generated.datasource_dataset module

class openlineage.client.generated.datasource_dataset.DatasourceDatasetFacet(name=None, uri=None, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • name (str | None)

  • uri (str | None)

  • producer (str)

  • deleted (bool | None)

name: str | None
uri: str | None
uri_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

openlineage.client.generated.documentation_dataset module

class openlineage.client.generated.documentation_dataset.DocumentationDatasetFacet(description, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • description (str)

  • producer (str)

  • deleted (bool | None)

description: str

The description of the dataset.

openlineage.client.generated.documentation_job module

class openlineage.client.generated.documentation_job.DocumentationJobFacet(description, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • description (str)

  • producer (str)

  • deleted (bool | None)

description: str

The description of the job.

openlineage.client.generated.error_message_run module

class openlineage.client.generated.error_message_run.ErrorMessageRunFacet(message, programmingLanguage, stackTrace=None, *, producer='')

Bases: RunFacet

Parameters:
  • message (str)

  • programmingLanguage (str)

  • stackTrace (str | None)

  • producer (str)

message: str

A human-readable string representing error message generated by observed system

programmingLanguage: str

Programming language the observed system uses.

stackTrace: str | None

A language-specific stack trace generated by observed system

openlineage.client.generated.external_query_run module

class openlineage.client.generated.external_query_run.ExternalQueryRunFacet(externalQueryId, source, *, producer='')

Bases: RunFacet

Parameters:
  • externalQueryId (str)

  • source (str)

  • producer (str)

externalQueryId: str

Identifier for the external system

source: str

source of the external query

openlineage.client.generated.extraction_error_run module

class openlineage.client.generated.extraction_error_run.Error(errorMessage, stackTrace=None, task=None, taskNumber=None)

Bases: RedactMixin

Parameters:
  • errorMessage (str)

  • stackTrace (str | None)

  • task (str | None)

  • taskNumber (int | None)

errorMessage: str

Text representation of extraction error message.

stackTrace: str | None

Stack trace of extraction error message

task: str | None

Text representation of task that failed. This can be, for example, SQL statement that parser could not interpret.

taskNumber: int | None

Order of task (counted from 0).

class openlineage.client.generated.extraction_error_run.ExtractionErrorRunFacet(totalTasks, failedTasks, errors, *, producer='')

Bases: RunFacet

Parameters:
  • totalTasks (int)

  • failedTasks (int)

  • errors (list[Error])

  • producer (str)

totalTasks: int

The number of distinguishable tasks in a run that were processed by OpenLineage, whether successfully or not. Those could be, for example, distinct SQL statements.

failedTasks: int

The number of distinguishable tasks in a run that were processed not successfully by OpenLineage. Those could be, for example, distinct SQL statements.

errors: list[Error]

openlineage.client.generated.job_type_job module

class openlineage.client.generated.job_type_job.JobTypeJobFacet(processingType, integration, jobType=None, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • processingType (str)

  • integration (str)

  • jobType (str | None)

  • producer (str)

  • deleted (bool | None)

processingType: str

BATCH or STREAMING

Type:

Job processing type like

integration: str

SPARK|DBT|AIRFLOW|FLINK

Type:

OpenLineage integration type of this job

jobType: str | None

QUERY|COMMAND|DAG|TASK|JOB|MODEL

Type:

Run type like

openlineage.client.generated.lifecycle_state_change_dataset module

class openlineage.client.generated.lifecycle_state_change_dataset.LifecycleStateChange(value)

Bases: Enum

The lifecycle state change.

ALTER = 'ALTER'
CREATE = 'CREATE'
DROP = 'DROP'
OVERWRITE = 'OVERWRITE'
RENAME = 'RENAME'
TRUNCATE = 'TRUNCATE'
class openlineage.client.generated.lifecycle_state_change_dataset.LifecycleStateChangeDatasetFacet(lifecycleStateChange, previousIdentifier=None, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • lifecycleStateChange (LifecycleStateChange)

  • previousIdentifier (PreviousIdentifier | None)

  • producer (str)

  • deleted (bool | None)

lifecycleStateChange: LifecycleStateChange

The lifecycle state change.

previousIdentifier: PreviousIdentifier | None

Previous name of the dataset in case of renaming it.

class openlineage.client.generated.lifecycle_state_change_dataset.PreviousIdentifier(name, namespace)

Bases: RedactMixin

Previous name of the dataset in case of renaming it.

Parameters:
  • name (str)

  • namespace (str)

name: str
namespace: str

openlineage.client.generated.nominal_time_run module

class openlineage.client.generated.nominal_time_run.NominalTimeRunFacet(nominalStartTime, nominalEndTime=None, *, producer='')

Bases: RunFacet

Parameters:
  • nominalStartTime (str)

  • nominalEndTime (str | None)

  • producer (str)

nominalStartTime: str

//en.wikipedia.org/wiki/ISO_8601) timestamp representing the nominal start time (included) of the run. AKA the schedule time

Type:

An [ISO-8601](https

nominalEndTime: str | None

//en.wikipedia.org/wiki/ISO_8601) timestamp representing the nominal end time (excluded) of the run. (Should be the nominal start time of the next run)

Type:

An [ISO-8601](https

nominalstarttime_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

nominalendtime_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

openlineage.client.generated.output_statistics_output_dataset module

class openlineage.client.generated.output_statistics_output_dataset.OutputStatisticsOutputDatasetFacet(rowCount=None, size=None, fileCount=None, *, producer='')

Bases: OutputDatasetFacet

Parameters:
  • rowCount (int | None)

  • size (int | None)

  • fileCount (int | None)

  • producer (str)

rowCount: int | None

The number of rows written to the dataset

size: int | None

The size in bytes written to the dataset

fileCount: int | None

The number of files written to the dataset

openlineage.client.generated.ownership_dataset module

class openlineage.client.generated.ownership_dataset.Owner(name, type=None)

Bases: RedactMixin

Parameters:
  • name (str)

  • type (str | None)

name: str

the identifier of the owner of the Dataset. It is recommended to define this as a URN. For example application:foo, user:jdoe, team:data

type: str | None

The type of ownership (optional)

class openlineage.client.generated.ownership_dataset.OwnershipDatasetFacet(owners=_Nothing.NOTHING, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • owners (list[Owner] | None)

  • producer (str)

  • deleted (bool | None)

owners: list[Owner] | None

The owners of the dataset.

openlineage.client.generated.ownership_job module

class openlineage.client.generated.ownership_job.Owner(name, type=None)

Bases: RedactMixin

Parameters:
  • name (str)

  • type (str | None)

name: str

the identifier of the owner of the Job. It is recommended to define this as a URN. For example application:foo, user:jdoe, team:data

type: str | None

The type of ownership (optional)

class openlineage.client.generated.ownership_job.OwnershipJobFacet(owners=_Nothing.NOTHING, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • owners (list[Owner] | None)

  • producer (str)

  • deleted (bool | None)

owners: list[Owner] | None

The owners of the job.

openlineage.client.generated.parent_run module

class openlineage.client.generated.parent_run.Job(namespace, name)

Bases: RedactMixin

Parameters:
  • namespace (str)

  • name (str)

namespace: str

The namespace containing that job

name: str

The unique name for that job within that namespace

class openlineage.client.generated.parent_run.ParentRunFacet(run, job, *, producer='')

Bases: RunFacet

the id of the parent run and job, iff this run was spawn from an other run (for example, the Dag run scheduling its tasks)

Parameters:
  • run (Run)

  • job (Job)

  • producer (str)

run: Run
job: Job
classmethod create(runId, namespace, name)
Parameters:
  • runId (str)

  • namespace (str)

  • name (str)

Return type:

ParentRunFacet

class openlineage.client.generated.parent_run.Run(runId)

Bases: RedactMixin

Parameters:

runId (str)

runId: str

The globally unique ID of the run associated with the job.

runid_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

openlineage.client.generated.processing_engine_run module

class openlineage.client.generated.processing_engine_run.ProcessingEngineRunFacet(version, name=None, openlineageAdapterVersion=None, *, producer='')

Bases: RunFacet

Parameters:
  • version (str)

  • name (str | None)

  • openlineageAdapterVersion (str | None)

  • producer (str)

version: str

Processing engine version. Might be Airflow or Spark version.

name: str | None

Processing engine name, e.g. Airflow or Spark

openlineageAdapterVersion: str | None

OpenLineage adapter package version. Might be e.g. OpenLineage Airflow integration package version

openlineage.client.generated.schema_dataset module

class openlineage.client.generated.schema_dataset.SchemaDatasetFacet(fields=_Nothing.NOTHING, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • fields (list[SchemaDatasetFacetFields] | None)

  • producer (str)

  • deleted (bool | None)

fields: list[SchemaDatasetFacetFields] | None

The fields of the data source.

class openlineage.client.generated.schema_dataset.SchemaDatasetFacetFields(name, type=None, description=None, fields=_Nothing.NOTHING)

Bases: RedactMixin

Parameters:
  • name (str)

  • type (str | None)

  • description (str | None)

  • fields (list[SchemaDatasetFacetFields] | None)

name: str

The name of the field.

type: str | None

The type of the field.

description: str | None

The description of the field.

fields: list[SchemaDatasetFacetFields] | None

Nested struct fields.

openlineage.client.generated.source_code_job module

class openlineage.client.generated.source_code_job.SourceCodeJobFacet(language, sourceCode, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • language (str)

  • sourceCode (str)

  • producer (str)

  • deleted (bool | None)

language: str

Language in which source code of this job was written.

sourceCode: str

Source code of this job.

openlineage.client.generated.source_code_location_job module

class openlineage.client.generated.source_code_location_job.SourceCodeLocationJobFacet(type, url, repoUrl=None, path=None, version=None, tag=None, branch=None, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • type (str)

  • url (str)

  • repoUrl (str | None)

  • path (str | None)

  • version (str | None)

  • tag (str | None)

  • branch (str | None)

  • producer (str)

  • deleted (bool | None)

type: str

the source control system

url: str

the full http URL to locate the file

repoUrl: str | None

the URL to the repository

path: str | None

the path in the repo containing the source files

version: str | None

the current version deployed (not a branch name, the actual unique version)

tag: str | None

optional tag name

branch: str | None

optional branch name

url_check(attribute, value)
Parameters:
  • attribute (str)

  • value (str)

Return type:

None

openlineage.client.generated.sql_job module

class openlineage.client.generated.sql_job.SQLJobFacet(query, *, producer='', deleted=None)

Bases: JobFacet

Parameters:
  • query (str)

  • producer (str)

  • deleted (bool | None)

query: str

openlineage.client.generated.storage_dataset module

class openlineage.client.generated.storage_dataset.StorageDatasetFacet(storageLayer, fileFormat=None, *, producer='', deleted=None)

Bases: DatasetFacet

Parameters:
  • storageLayer (str)

  • fileFormat (str | None)

  • producer (str)

  • deleted (bool | None)

storageLayer: str

iceberg, delta.

Type:

Storage layer provider with allowed values

fileFormat: str | None

parquet, orc, avro, json, csv, text, xml.

Type:

File format with allowed values

openlineage.client.transport.console module

class openlineage.client.transport.console.ConsoleConfig

Bases: Config

class openlineage.client.transport.console.ConsoleTransport(config)

Bases: Transport

Parameters:

config (ConsoleConfig)

kind: str | None = 'console'
config_class

alias of ConsoleConfig

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

None

openlineage.client.transport.factory module

class openlineage.client.transport.factory.DefaultTransportFactory

Bases: TransportFactory

register_transport(of_type, clazz)
Parameters:
  • of_type (str)

  • clazz (type[Transport] | str)

Return type:

None

create(config=None)
Parameters:

config (dict[str, str] | None)

Return type:

Transport

openlineage.client.transport.file module

class openlineage.client.transport.file.FileConfig(log_file_path, append=False)

Bases: Config

Parameters:
  • log_file_path (str)

  • append (bool)

log_file_path: str
append: bool = False
classmethod from_dict(params)
Parameters:

params (dict[str, Any])

Return type:

FileConfig

class openlineage.client.transport.file.FileTransport(config)

Bases: Transport

Parameters:

config (FileConfig)

kind: str | None = 'file'
config_class

alias of FileConfig

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

None

openlineage.client.transport.http module

class openlineage.client.transport.http.TokenProvider(config)

Bases: object

Parameters:

config (dict[str, str])

get_bearer()
Return type:

str | None

class openlineage.client.transport.http.HttpCompression(value)

Bases: Enum

An enumeration.

GZIP = 'gzip'
class openlineage.client.transport.http.ApiKeyTokenProvider(config)

Bases: TokenProvider

Parameters:

config (dict[str, str])

get_bearer()
Return type:

str | None

openlineage.client.transport.http.create_token_provider(auth)
Parameters:

auth (dict[str, str])

Return type:

TokenProvider

openlineage.client.transport.http.get_session()
Return type:

Session

class openlineage.client.transport.http.HttpConfig(url, endpoint='api/v1/lineage', timeout=5.0, verify=True, auth=_Nothing.NOTHING, compression=None, session=None, adapter=None)

Bases: Config

Parameters:
  • url (str)

  • endpoint (str)

  • timeout (float)

  • verify (bool)

  • auth (TokenProvider)

  • compression (HttpCompression | None)

  • session (Session | None)

  • adapter (HTTPAdapter | None)

url: str
endpoint: str
timeout: float
verify: bool
auth: TokenProvider
compression: HttpCompression | None
session: Session | None
adapter: HTTPAdapter | None
classmethod from_dict(params)
Parameters:

params (dict[str, Any])

Return type:

HttpConfig

classmethod from_options(url, options, session)
Parameters:
  • url (str)

  • options (OpenLineageClientOptions)

  • session (Session | None)

Return type:

HttpConfig

class openlineage.client.transport.http.HttpTransport(config)

Bases: Transport

Parameters:

config (HttpConfig)

kind: str | None = 'http'
config_class

alias of HttpConfig

set_adapter(adapter)
Parameters:

adapter (HTTPAdapter)

Return type:

None

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

Response

openlineage.client.transport.kafka module

class openlineage.client.transport.kafka.KafkaConfig(config, topic, messageKey=None, flush=True)

Bases: Config

Parameters:
  • config (dict[str, str])

  • topic (str)

  • messageKey (str | None)

  • flush (bool)

config: dict[str, str]
topic: str
messageKey: str | None
flush: bool
classmethod from_dict(params)
Parameters:

params (dict[str, Any])

Return type:

_T

openlineage.client.transport.kafka.on_delivery(err, msg)
Parameters:
  • err (KafkaError)

  • msg (Message)

Return type:

None

class openlineage.client.transport.kafka.KafkaTransport(config)

Bases: Transport

Parameters:

config (KafkaConfig)

kind: str | None = 'kafka'
config_class

alias of KafkaConfig

emit(event)
Parameters:

event (Event)

Return type:

None

openlineage.client.transport.msk_iam module

class openlineage.client.transport.msk_iam.MSKIAMConfig(config, topic, messageKey=None, flush=True, region=None, aws_profile=None, role_arn=None, aws_debug_creds=False)

Bases: KafkaConfig

Parameters:
  • config (dict[str, str])

  • topic (str)

  • messageKey (str | None)

  • flush (bool)

  • region (str)

  • aws_profile (None | str)

  • role_arn (None | str)

  • aws_debug_creds (bool)

region: str
aws_profile: None | str
role_arn: None | str
aws_debug_creds: bool
class openlineage.client.transport.msk_iam.MSKIAMTransport(config)

Bases: KafkaTransport

Parameters:

config (MSKIAMConfig)

kind: str | None = 'msk-iam'
config_class

alias of MSKIAMConfig

openlineage.client.transport.noop module

class openlineage.client.transport.noop.NoopConfig

Bases: Config

class openlineage.client.transport.noop.NoopTransport(config)

Bases: Transport

Parameters:

config (NoopConfig)

kind: str | None = 'noop'
config_class

alias of NoopConfig

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

None

openlineage.client.transport.transport module

To implement custom Transport implement Config and Transport classes.

Transport needs to
  • specify class variable config that will point to Config class that Transport requires

  • __init__ that will accept specified Config class instance

  • implement emit method that will accept RunEvent

Config file is read and parameters there are passed to from_dict classmethod. The config class can have more complex attributes, but needs to be able to instantiate them in from_dict method.

DefaultTransportFactory instantiates custom transports by looking at type field in class config.

class openlineage.client.transport.transport.Config

Bases: object

classmethod from_dict(params)
Parameters:

params (dict[str, Any])

Return type:

_T

class openlineage.client.transport.transport.Transport

Bases: object

kind: str | None = None
config_class

alias of Config

emit(event)
Parameters:

event (Union[RunEvent, DatasetEvent, JobEvent, RunEvent, DatasetEvent, JobEvent])

Return type:

Any

class openlineage.client.transport.transport.TransportFactory

Bases: object

create(config=None)
Parameters:

config (dict[str, str] | None)

Return type:

Transport