Enable Trace Propagation#65
Conversation
When the configuration parameter `EnableTracePropagation` is enabled, the traces generated by river workers will be *linked* to the traces that enqueued them. This is done by use of metadata parameters, following the example in River's documentation for the middleware functionality (https://riverqueue.com/docs/middleware). This uses a *link* instead of directly making a child span which is semantically more correct. Fixes riverqueue#41 (I believe, at least).
|
@codex review |
|
Just an FYI: we've rolled this code out in production at my company and it has been successful (traces from jobs are properly linking to whatever kicked them off). |
brandur
left a comment
There was a problem hiding this comment.
Good stuff! Want to add a changelog entry?
| for k, v := range carrier { | ||
| meta[k] = v | ||
| } | ||
| injected, err := json.Marshal(meta) |
There was a problem hiding this comment.
What do you think about using sjson here instead of unmarshaling and then re-marshaling? sjon's already in the dependency graph, so it's not too big of a deal.
func injectTraceContext(ctx context.Context, metadata []byte) []byte {
carrier := make(propagation.MapCarrier)
propagation.TraceContext{}.Inject(ctx, carrier)
if len(carrier) == 0 {
return metadata
}
original := metadata
for k, v := range carrier {
var err error
metadata, err = sjson.SetBytes(metadata, k, v)
if err != nil {
return original
}
}
return metadata
}| } | ||
| } | ||
| extracted := propagation.TraceContext{}.Extract(ctx, carrier) | ||
| return trace.SpanFromContext(extracted).SpanContext() |
There was a problem hiding this comment.
I think this piece is subtly not-quite-right in that the purpose of this function is always to pull a span context out of the given metadata and we should never be looking to get out of active context. What do you think about removing the ctx parameter from extractSpanContext in favor of using context.Background() here? If no span context is founded in metadata, then nothing should be extracted.
|
|
||
| // EnableTracePropagation injects W3C trace context (traceparent/tracestate) | ||
| // into job metadata on insert and extracts it on work, adding a span link | ||
| // from the work span back to the span that enqueued the job. A link is used |
There was a problem hiding this comment.
Do you want to maybe move the sentence starting from "A link is used ..." from here down to this line?
startOpts = append(startOpts, trace.WithLinks(trace.Link{SpanContext: sc}))
It feels like rationalizing the use of a link here is a little bit TMI for the docblock and belongs more in an internal comment.
When the configuration parameter
EnableTracePropagationis enabled, the traces generated by river workers will be linked to the traces that enqueued them. This is done by use of metadata parameters, following the example in River's documentation for the middleware functionality (https://riverqueue.com/docs/middleware).This uses a link instead of directly making a child span which is semantically more correct—if a parent span is used, you get weird situations where a POST request that returns in 50ms somehow contains a trace that starts two hours later and takes 10 minutes. River jobs are async, after all.
Fixes #41 (I believe, at least).