1. Introduction
This section is non-normative.
1.1. Motivation
Browsers are now working to prevent cross-site user tracking, including by partitioning storage and removing third-party cookies. There are a range of API proposals to continue supporting legitimate use cases in a way that respects user privacy. Many of these APIs, including the Shared Storage API and the Protected Audience API, isolate potentially identifying cross-site data in special contexts, which ensures that the data cannot escape the user agent.
Relative to cross-site data from an individual user, aggregate data about groups of users can be less sensitive and yet sufficient for a wide range of use cases. An aggregation service has been built to allow reporting noisy, aggregated cross-site data. This service was originally created for use by the Attribution Reporting API, but allowing more general aggregation supports additional use cases. In particular, the Protected Audience and Shared Storage APIs expect this functionality to be available.
1.2. Overview
This document outlines a general-purpose API that can be called from isolated contexts that have access to cross-site data (such as a Shared Storage worklet). Within these contexts, potentially identifying data can be encapsulated into "aggregatable reports". To prevent leakage, the cross-site data in these reports is encrypted to ensure it can only be processed by the aggregation service. During processing, this service adds noise and imposes limits on how many queries can be performed.
This API provides functions that allow the origin to construct an aggregatable report and specify the values to be embedded into its encrypted payload (for later computation via the aggregation service). These calls result in the aggregatable report being queued to be sent to the reporting endpoint of the script’s origin after a delay. After the endpoint receives the reports, it will batch the reports and send them to the aggregation service for processing. The output of that process is a summary report containing the (approximate) result, which is dispatched back to the script’s origin.
1.3. Alternatives considered
Instead of the chosen API shape, we considered aligning with a design much closer to fetch(). However, there are a few key differences which make this unfavorable:
- This API is designed to be used in isolated contexts where fetch() is not available.
- It’s an anti-goal to give the developer control over when aggregatable reports are being sent, or knowledge that they were sent (outside of the isolated context). Note, however, the exception when providing a context ID from outside the isolated context; see Protecting against leaks via the number of reports below.
- The reports cannot be sent to arbitrary reporting endpoints, only a particular .well-known path on the script origin.
- The report’s input is very specific (an array of PAHistogramContributions) and is not amenable to fetch()’s general purpose contents.
- There is no concept of a response.
So, we chose the more tailored API shape detailed below.
2. Exposed interface
[Exposed=(InterestGroupScriptRunnerGlobalScope, SharedStorageWorklet), SecureContext]
interface PrivateAggregation {
  undefined contributeToHistogram(PAHistogramContribution contribution);
  undefined contributeToHistogramOnEvent(DOMString event,
      record<DOMString, any> contribution);
  undefined enableDebugMode(optional PADebugModeOptions options = {});
};

dictionary PAHistogramContribution {
  required bigint bucket;
  required long value;
  bigint filteringId = 0;
};

dictionary PADebugModeOptions {
  required bigint debugKey;
};

Per the Web Platform Design Principles, we should consider switching long to [EnforceRange] long long.
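As a sketch of how this interface might be called, the snippet below stands in for a Shared Storage worklet; the stub object only mimics the shape of the privateAggregation global defined by the IDL above so the example is self-contained, and its internals are purely illustrative.

```javascript
// Stub standing in for the worklet-provided privateAggregation global.
// In a real Shared Storage worklet, the global scope supplies this object.
const privateAggregation = {
  contributions: [],
  contributeToHistogram(contribution) {
    this.contributions.push(contribution);
  },
  enableDebugMode(options = {}) {
    this.debugOptions = options;
  },
};

// Record a histogram contribution: bucket and filteringId are bigints,
// value is a non-negative integer (per the dictionary above).
privateAggregation.contributeToHistogram({
  bucket: 1369n,   // up to 2^128 - 1
  value: 45,
  filteringId: 0n, // defaults to 0; bounded by the filtering ID max bytes
});
```

The report containing this contribution is later delivered to the script origin’s .well-known reporting endpoint after a delay, not returned to the caller.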
enableDebugMode(options)’s argument should not have a default value of {}. Alternatively, debugKey should not be required in PADebugModeOptions.
Each PrivateAggregation object has the following fields:
- scoping details (default null)
  A scoping details or null
- allowed to use (default false)
  A boolean
- should perform default contributeToHistogramOnEvent() processing (default an algorithm that always returns true)
  An algorithm that takes a PrivateAggregation, a DOMString and a map (with DOMString keys) and returns either a boolean or an exception.
  Note: This allows embedding APIs to override processing for all calls or just certain ones. A return value of true indicates that this call will be processed using the default implementation in this spec. A return value that is an exception allows the embedding API’s processing to throw.
Note: See Exposing to global scopes below.
The contributeToHistogram(PAHistogramContribution contribution) method steps are:
- Let validationResult be the result of validating a histogram contribution given contribution and this’s scoping details.
- If validationResult is an exception, throw validationResult.
- Assert: validationResult is a contribution cache entry.
- Append validationResult to the contribution cache.

Consider accepting an array of contributions. [Issue #44]
The contributeToHistogramOnEvent(event, contribution) method steps are:
- Let defaultProcessingResult be the result of running this’s should perform default contributeToHistogramOnEvent() processing given this, event and contribution.
- If defaultProcessingResult is an exception, throw defaultProcessingResult.
- If defaultProcessingResult is false, return.
- Set contribution to the result of converting contribution’s JavaScript value to the IDL type PAHistogramContribution.
  Note: This throws a TypeError if contribution is not compatible.
- Let validationResult be the result of validating a histogram contribution given contribution and this’s scoping details.
- If validationResult is an exception, throw validationResult.
- Assert: validationResult is a contribution cache entry.
- Let maybeContributionCacheEntry be validationResult.
- If event does not start with "reserved.", throw a TypeError.
- Let unprefixedEvent be the code unit substring from "reserved."’s length to event’s length within event.
- If unprefixedEvent is any of the internal error events:
  - Set maybeContributionCacheEntry’s error event to unprefixedEvent.
  - Append maybeContributionCacheEntry to the contribution cache.

Note: For forward compatibility, we do not throw an error for unrecognized events that start with "reserved.".
The enableDebugMode(optional PADebugModeOptions options) method steps are:
- Let scopingDetails be this’s scoping details.
- Let debugScope be the result of running scopingDetails’ get debug scope steps.
- If debug scope map[debugScope] exists, throw a "DataError" DOMException.
  Note: This would occur if enableDebugMode() has already been run for this debug scope.
- Let debugKey be null.
- If options was given, set debugKey to options["debugKey"].
- Let debugDetails be a new debug details with the items:
  - enabled: true
  - key: debugKey
- Optionally, set debugDetails to a new debug details.
  Note: This allows the user agent to make debug mode unavailable globally or just for certain callers.
- Set debug scope map[debugScope] to debugDetails.

Ensure errors are of an appropriate type, e.g. InvalidAccessError is deprecated.
3. Exposing to global scopes
To expose this API to a global scope, a read only attribute privateAggregation
of type PrivateAggregation
should be exposed on the
global scope. Its getter steps should be set to the get the privateAggregation steps given this.
Each global scope should set the allowed to use for the PrivateAggregation
object it exposes based on whether a relevant document is allowed to use the "private-aggregation
" policy-controlled feature.
Additionally, each global scope should set the scoping details for the PrivateAggregation
object it exposes to a non-null value.
The global scope should wait to set the field until the API is intended to be
available.
For any batching scope returned by the get batching scope steps, the process contributions for a batching scope steps should later be performed given that same batching scope, the global scope’s relevant settings object’s origin, some context type and a timeout (or null).
Note: This last requirement means that global scopes with different origins cannot share the same batching scope, see Same-origin policy discussion.
For any debug scope returned by the get debug scope steps, the mark a debug scope complete steps should later be performed given that same debug scope.
Note: A later algorithm asserts that, for any contribution cache entry in the contribution cache, the mark a debug scope complete steps were performed given the entry’s debug scope before the process contributions for a batching scope steps are performed given the entry’s batching scope.
3.1. Overriding contributeToHistogramOnEvent() processing
Each API may also set the should perform default contributeToHistogramOnEvent() processing algorithm for the PrivateAggregation
object it exposes. This hook enables the embedding API to
override processing for any contributeToHistogramOnEvent()
call it wants to.
This algorithm should return true to indicate that the regular processing defined in this spec should occur (for that call). It should return an exception to indicate an error has occurred in the embedding API’s processing. Alternatively, it should return false to indicate that the embedding API will process the call, but no exception should be thrown.
If the embedding API overrides processing for a call (i.e. the algorithm does not return true), it should accept any event that is a concatenation of « "reserved.", errorEvent », where errorEvent is an internal error event, and should accept any contribution with a JavaScript value that is convertible to the IDL type PAHistogramContribution. That is, it should not return an exception in those cases. However, the embedding API may accept additional events or contributions that would not be accepted by the default processing defined in this spec.
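An embedding API’s override hook might look like the sketch below. The event name "embedder.custom-error" and the function name are hypothetical; the three-way return convention (true, false, or an exception) is the one this spec defines for the should perform default contributeToHistogramOnEvent() processing algorithm.

```javascript
// Sketch of an embedding API's override hook. Returning true defers to
// the default processing in this spec; returning false means the
// embedder processes the call itself (no exception thrown); returning
// an exception object makes the calling method throw it.
function shouldPerformDefaultProcessing(privateAggregation, event, contribution) {
  if (event.startsWith("reserved.")) {
    // Internal error events must be accepted; let the spec's default
    // steps validate and cache the contribution.
    return true;
  }
  if (event === "embedder.custom-error") {
    // Hypothetical external error event handled by the embedder, which
    // must later append cache entries itself (or drop the contribution).
    return false;
  }
  return new TypeError(`unknown event: ${event}`);
}
```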
If the embedding API overrides processing for a call specifying a contribution conditional on an internal error event, the embedding API should call append an entry to the contribution cache with the appropriate entry unless it intends to drop that contribution.
If the embedding API overrides processing for a call to support additional error
events, it must wait until the associated condition has been determined to have
occurred or not. If the custom error event was triggered, it should use already triggered external error
as the error event for the associated contributions. If the custom error event
was not triggered, the contributions should be dropped and no call to append an entry to the contribution cache should be made for those contributions.
Note also that, if the embedding API overrides processing, it must convert contribution to the IDL type PAHistogramContribution before it is able to call append an entry to the contribution cache. This applies whether or not the contribution has a JavaScript value that is automatically convertible to the IDL type.
3.2. APIs exposing Private Aggregation
This section is non-normative.
This API is currently exposed in global scopes defined in the specifications of two APIs: the Shared Storage API and the Protected Audience API.
4. Structures
4.1. Batching scope
A batching scope is a unique internal value that identifies which PAHistogramContributions should be sent in the same aggregatable report unless their debug details differ.

Unique internal value is not an exported definition. See infra/583.
4.2. Debug scope
A debug scope is a unique internal value that identifies which PAHistogramContributions should have their debug details affected by the presence or absence of a call to enableDebugMode() in the same period of execution.
4.3. Scoping details
A scoping details is a struct with the following items:
- get batching scope steps
  An algorithm returning a batching scope
- get debug scope steps
  An algorithm returning a debug scope
4.4. Debug details
A debug details is a struct with the following items:
- enabled (default false)
  A boolean
- key (default null)
  An unsigned 64-bit integer or null. The key must be null if enabled is false.
4.5. Error events
An internal error event is one of the following:
- "report-success"
  The report was scheduled and no contributions were dropped.
- "too-many-contributions"
  The report was scheduled, but some contributions were dropped due to the per-report limit.
- "empty-report-dropped"
  The report was not scheduled as it had no contributions.
- "pending-report-limit-reached"
  The report was scheduled, but the limit of pending reports was reached. That is, attempting to schedule one more report would fail due to the limit.
- "insufficient-budget"
  One or more contributions in the report were dropped (or the whole report was) as there was not enough contribution budget.
- "contribution-timeout-reached"
  The context(s) associated with the report were still running when the contribution timeout occurred.

An error event is an internal error event or the special value already triggered external error.

Note: This special value represents any external error event that has already occurred. External error events are defined by embedding APIs and are distinct from internal error events. See contributeToHistogramOnEvent() for more details.

All error events is a list of error events that consists of all the internal error events in the order they are defined above, followed by already triggered external error.
4.6. Contribution cache entry
A contribution cache entry is a struct with the following items:
- contribution
  A PAHistogramContribution
- error event (default null)
  An error event or null.
  Note: This indicates which error event the contribution is conditional on, or null if the contribution is unconditional.
- batching scope
  A batching scope
- debug scope
  A debug scope
- debug details (default null)
  A debug details or null
4.7. Pending contributions
A pending contributions is a struct with the following items:
- unconditional contributions (default: a new list)
  A list of PAHistogramContributions
- conditional contributions (default: a new map)
  A map where its keys are error events and values are lists of PAHistogramContributions
- triggered error events (default: a new set)
  A set of internal error events
4.8. Aggregatable report
An aggregatable report is a struct with the following items:
- reporting origin
  An origin
- original report time
  A moment
- report time
  A moment
- contributions
  A list of PAHistogramContributions
- api
  A context type
- report ID
  A string
- debug details
  A debug details
- aggregation coordinator
  An aggregation coordinator
- context ID
  A string or null
- filtering ID max bytes
  A positive integer
- max contributions
  A positive integer
- queued
  A boolean
4.9. Aggregation coordinator
An aggregation coordinator is an origin that the allowed aggregation coordinator set contains.
Consider switching to the suitable origin concept used by the Attribution Reporting API here and elsewhere.
Move other structures to be defined inline instead of via a header. Consider also removing all the subheadings.
4.10. Context type
A context type is a string indicating what kind of global scope the PrivateAggregation object was exposed in. Each API exposing Private Aggregation should pick a unique string (or multiple) for this.
4.11. Pre-specified report parameters
A pre-specified report parameters is a struct with the following items:
- context ID (default: null)
  A string or null
- filtering ID max bytes (default: default filtering ID max bytes)
  A positive integer
- max contributions (default: null)
  A positive integer or null
5. Storage
A user agent holds an aggregatable report cache, which is a list of aggregatable reports.
A user agent holds an aggregation coordinator map, which is a map from batching scopes to aggregation coordinators.
A user agent holds a pre-specified report parameters map, which is a map from batching scopes to pre-specified report parameters.
A user agent holds a contribution cache, which is a list of contribution cache entries.
A user agent holds a debug scope map, which is a map from debug scopes to debug details.
Elsewhere, link to definition when using user agent.
5.1. Clearing storage
The user agent must expose controls that allow the user to delete data from the aggregatable report cache as well as any contribution history data stored for the query the budget algorithm.
The user agent may expose controls that allow the user to delete data from the contribution cache, the debug scope map and the pre-specified report parameters map.
6. Constants
Default filtering ID max bytes is a positive integer controlling the max bytes used if none is explicitly chosen. Its value is 1.
Valid filtering ID max bytes range is a set of positive integers controlling the allowable values of max bytes. Its value is the range 1 to 8, inclusive.
Consider adding more constants.
7. Implementation-defined values
Allowed aggregation coordinator set is a set of origins that controls which origins are valid aggregation coordinators. Every item in this set must be a potentially trustworthy origin.
Default aggregation coordinator is an aggregation coordinator that controls which is used for a report if none is explicitly selected.
Maximum maxContributions is a positive integer that defines an upper bound on the number of contributions per aggregatable report.
Default maxContributions by API is a map from context types to positive integers. Semantically, it defines the default number of contributions per report for every kind of calling context, e.g. Shared Storage. The values in this map are used when callers do not specifically request another value. Each value in this map must be less than or equal to maximum maxContributions.
Minimum report delay is a non-negative duration that controls the minimum delay to deliver an aggregatable report.
Randomized report delay is a positive duration that controls the random delay to deliver an aggregatable report. This delay is additional to the minimum report delay.
8. Permissions Policy integration
This specification defines a policy-controlled feature identified by the
string "private-aggregation
". Its default allowlist is "*
".
Note: The allowed to use field is set by other specifications that integrate with this API according to this policy-controlled feature.
9. Algorithms
To serialize an integer, represent it as a string of the shortest possible decimal number.
This would ideally be replaced by a more descriptive algorithm in Infra. See infra/201.
9.1. Exported algorithms
Note: These algorithms allow other specifications to integrate with this API.
To get the privateAggregation given a PrivateAggregation this:
- Let scopingDetails be this’s scoping details.
- If scopingDetails is null, throw a "NotAllowedError" DOMException.
  Note: This indicates the API is not yet available, for example, because the initial execution of the script after loading is not complete.
  Consider improving developer ergonomics here (e.g. a way to detect this case).
- If this’s allowed to use is false, throw an "InvalidAccessError" DOMException.
- Return this.

Ensure errors are of an appropriate type, e.g. InvalidAccessError is deprecated.
To append an entry to the contribution cache given a contribution cache entry entry:
- Append entry to the contribution cache.
To get debug details given a debug scope debugScope:
- If debug scope map[debugScope] exists, return debug scope map[debugScope].
- Otherwise, return a new debug details.
To mark a debug scope complete given a debug scope debugScope and a debug details or null debugDetailsOverride:
- Let debugDetails be debugDetailsOverride.
- If debug scope map[debugScope] exists:
  - Assert: debugDetailsOverride is null.
    Note: The override can be provided if the debug details have not been set otherwise.
  - Set debugDetails to debug scope map[debugScope].
  - Remove debug scope map[debugScope].
  - If debugDetails’s key is not null, assert: debugDetails’s enabled is true.
- If debugDetails is null, set debugDetails to a new debug details.
- For each entry of the contribution cache:
  - If entry’s debug scope is debugScope, set entry’s debug details to debugDetails.
To determine if a report should be sent deterministically given a pre-specified report parameters preSpecifiedParams and a context type api, perform the following steps. They return a boolean.
- If preSpecifiedParams’ context ID is not null, return true.
- If preSpecifiedParams’ filtering ID max bytes is not the default filtering ID max bytes, return true.
- Let effectiveMaxContributions be the result of determining the max contributions with api and preSpecifiedParams’ max contributions.
- Let defaultMaxContributions be default maxContributions by API[api].
- If effectiveMaxContributions is not defaultMaxContributions, return true.
- Return false.
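The deterministic-report check above can be sketched as below. The constant values and the "shared-storage" context type string are illustrative stand-ins for the implementation-defined default filtering ID max bytes and default maxContributions by API; only the comparison logic follows the steps in this spec.

```javascript
// Illustrative stand-ins for implementation-defined constants.
const DEFAULT_FILTERING_ID_MAX_BYTES = 1;
const DEFAULT_MAX_CONTRIBUTIONS_BY_API = new Map([["shared-storage", 20]]);

// Mirrors "determine the max contributions": null falls back to the
// per-API default.
function determineMaxContributions(api, maxContributions) {
  if (maxContributions === null) {
    return DEFAULT_MAX_CONTRIBUTIONS_BY_API.get(api);
  }
  return maxContributions;
}

// A report is deterministic iff any pre-specified parameter differs
// from its default, so its presence cannot leak cross-site data.
function isDeterministicReport(params, api) {
  if (params.contextId !== null) return true;
  if (params.filteringIdMaxBytes !== DEFAULT_FILTERING_ID_MAX_BYTES) return true;
  const effective = determineMaxContributions(api, params.maxContributions);
  return effective !== DEFAULT_MAX_CONTRIBUTIONS_BY_API.get(api);
}
```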
Note: It is sometimes necessary to send a 'null report' to conceal the fact that there were no contributions. For instance, it’s possible that budget, which is cross-site data in its own right, was insufficient for the requested contributions. Alternatively, the caller might have chosen to make no contributions after reading cross-site data. In these kinds of scenarios, the absence of a report could reveal cross-site data to the reporting endpoint. See Protecting against leaks via the number of reports.
Note: Embedding APIs are expected to set timeout to null if a report isn’t deterministic or if the timeout was already reached, i.e. if it caused this call to be triggered. (A timeout always has to be set for deterministic reports.)
To process contributions for a batching scope given a batching scope batchingScope, an origin reportingOrigin, a context type contextType and a moment or null timeout:
- Let batchEntries be a new list.
- For each entry of the contribution cache:
  - If entry’s batching scope is batchingScope:
    - Assert: entry’s debug details is not null.
      Note: This asserts that the mark a debug scope complete steps were run before the process contributions for a batching scope steps.
    - Append entry to batchEntries.
- Let aggregationCoordinator be the default aggregation coordinator.
- If aggregation coordinator map[batchingScope] exists:
  - Set aggregationCoordinator to aggregation coordinator map[batchingScope].
  - Remove aggregation coordinator map[batchingScope].
- Let preSpecifiedParams be a new pre-specified report parameters.
- If pre-specified report parameters map[batchingScope] exists:
  - Set preSpecifiedParams to pre-specified report parameters map[batchingScope].
  - Remove pre-specified report parameters map[batchingScope].
- Let isDeterministicReport be the result of determining if a report should be sent deterministically given preSpecifiedParams and contextType.
- If isDeterministicReport is false, assert: timeout is null.
  Note: Timeouts can only be used for deterministic reports.
- If batchEntries is empty and isDeterministicReport is false, return.
- Let batchedContributions be a new ordered map.
- For each entry of batchEntries:
  - Remove entry from the contribution cache.
  - Let debugDetails be entry’s debug details.
  - If batchedContributions[debugDetails] does not exist, set batchedContributions[debugDetails] to a new pending contributions.
  - If entry’s error event is null:
    - Append entry’s contribution to batchedContributions[debugDetails]'s unconditional contributions.
  - Otherwise:
    - Let errorEvent be entry’s error event.
    - Let conditionalContributions be batchedContributions[debugDetails]'s conditional contributions.
    - If conditionalContributions[errorEvent] does not exist, set conditionalContributions[errorEvent] to a new list.
    - Append entry’s contribution to conditionalContributions[errorEvent].
- If batchedContributions is empty:
  - Let debugDetails be a new debug details.
  - Set batchedContributions[debugDetails] to a new pending contributions.
- For each debugDetails → pendingContributions of batchedContributions:
  - Perform the report creation and scheduling steps with reportingOrigin, contextType, pendingContributions, debugDetails, aggregationCoordinator, preSpecifiedParams and timeout.

Note: These steps break up the contributions based on their debug details as each report can only have one set of metadata.
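The grouping these steps perform can be sketched as below. The function name and the use of JSON serialization as a grouping key are illustrative (the spec keys an ordered map on the debug details struct itself); the point is that entries sharing a batching scope are split per debug details, since each report carries one set of metadata.

```javascript
// Group contribution cache entries by their debug details. In this
// sketch, debug details are plain objects like { enabled, key }, and a
// serialized form is used as the map key; real debug keys are bigints,
// so a production grouping key would need different serialization.
function groupByDebugDetails(batchEntries) {
  const batched = new Map(); // serialized debug details -> contributions
  for (const entry of batchEntries) {
    const key = JSON.stringify(entry.debugDetails);
    if (!batched.has(key)) batched.set(key, []);
    batched.get(key).push(entry.contribution);
  }
  return batched;
}
```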
To determine if an origin is an aggregation coordinator given an origin origin:
- Return whether origin is an aggregation coordinator.

To obtain the aggregation coordinator given a USVString originString, perform the following steps. They return an aggregation coordinator or a DOMException.
- Let url be the result of running the URL parser on originString.
- If url is failure or null, return a new DOMException with name "SyntaxError".
- Let origin be url’s origin.
- If the result of determining if an origin is an aggregation coordinator given origin is false, return a new DOMException with name "DataError".
- Return origin.
To set the aggregation coordinator map entry given a batching scope batchingScope and an origin origin:
- Assert: origin is an aggregation coordinator.
- Set aggregation coordinator map[batchingScope] to origin.
Elsewhere, surround algorithms in a <div algorithm> block to match, and add styling for all algorithms per bikeshed/1472.
To set the pre-specified report parameters map entry given a batching scope batchingScope and a pre-specified report parameters params:
- Let contextId be params’ context ID.
- Assert: contextId is null or contextId’s length is not larger than 64.
- Let filteringIdMaxBytes be params’ filtering ID max bytes.
- Assert: filteringIdMaxBytes is contained in the valid filtering ID max bytes range.
- Let maxContributions be params’ max contributions.
- Assert: maxContributions is null or greater than zero.
- Set pre-specified report parameters map[batchingScope] to params.
To validate a histogram contribution given a PAHistogramContribution contribution and a scoping details scopingDetails, perform the following steps. They return a contribution cache entry or an exception.
- If contribution["bucket"] is not contained in the range 0 to 2^128, exclusive, return a RangeError.
- If contribution["value"] is negative, return a RangeError.
- Let batchingScope be the result of running scopingDetails’ get batching scope steps.
- Let filteringIdMaxBytes be the default filtering ID max bytes.
- If pre-specified report parameters map[batchingScope] exists:
  - Set filteringIdMaxBytes to pre-specified report parameters map[batchingScope]'s filtering ID max bytes.
- If contribution["filteringId"] is not contained in the range 0 to 256^filteringIdMaxBytes, exclusive, return a RangeError.
- Return a new contribution cache entry with the items:
  - contribution: contribution
  - batching scope: batchingScope
  - debug scope: the result of running scopingDetails’ get debug scope steps
Ensure errors are of an appropriate type, e.g. InvalidAccessError is deprecated.
9.2. Scheduling reports
The report creation and scheduling steps given an origin reportingOrigin, a context type api, a pending contributions pendingContributions, a debug details debugDetails, an aggregation coordinator aggregationCoordinator, a pre-specified report parameters preSpecifiedParams and a moment or null timeout are:
- Assert: reportingOrigin is a potentially trustworthy origin.
- Optionally, return.
  Note: This implementation-defined condition is intended to allow user agents to drop reports for a number of reasons, for example user opt-out, an origin not being enrolled or a limit on pending reports being reached.
- Let currentWallTime be the current wall time.
- Let allUnmergedContributions be the result of compiling all unmerged contributions, given reportingOrigin, api, pendingContributions, preSpecifiedParams, timeout and currentWallTime.
- Let isDeterministicReport be the result of determining if a report should be sent deterministically given preSpecifiedParams and api.
- Let effectiveMaxContributions be the result of determining the max contributions with api and preSpecifiedParams’ max contributions.
- Let keptMergeKeys be a new set.
- For each contribution of allUnmergedContributions:
  - If keptMergeKeys’ size is less than effectiveMaxContributions, append contribution’s merge key to keptMergeKeys.
- Remove all items that have a merge key that is not contained in keptMergeKeys from allUnmergedContributions.
- Let finalBudgetResults be the result of querying the budget given allUnmergedContributions, reportingOrigin, api, currentWallTime and true.
- Assert: finalBudgetResults’ size equals allUnmergedContributions’ size.
- For each i of the range 0 to finalBudgetResults’ size, exclusive:
  - If finalBudgetResults[i] is false, set allUnmergedContributions[i]'s value to 0.
- Remove all items from allUnmergedContributions that have a value of 0.
- Let mergedContributionsMap be a new ordered map.
- For each contribution of allUnmergedContributions:
  - If mergedContributionsMap[contribution’s merge key] exists, add contribution’s value to mergedContributionsMap[contribution’s merge key]'s value.
  - Otherwise, set mergedContributionsMap[contribution’s merge key] to contribution.
- Let mergedContributions be the mergedContributionsMap’s values.
- Assert: mergedContributions has a size less than or equal to effectiveMaxContributions.
- If mergedContributions is empty and isDeterministicReport is false, return.
- Let report be the result of obtaining an aggregatable report given reportingOrigin, api, mergedContributions, debugDetails, aggregationCoordinator, preSpecifiedParams, timeout and currentWallTime.
- Append report to the user agent’s aggregatable report cache.
To compile all unmerged contributions given an origin reportingOrigin, a context type api, a pending contributions pendingContributions, a pre-specified report parameters preSpecifiedParams, a moment or null timeout and a moment currentWallTime, perform the following steps. They return a list of PAHistogramContributions.
- Let isDeterministicReport be the result of determining if a report should be sent deterministically given preSpecifiedParams and api.
- Let wasTimeoutReached be the boolean indicating whether both isDeterministicReport is true and timeout is null.
- Record the internal error event result given pendingContributions, "contribution-timeout-reached" and wasTimeoutReached.
- Let provisionalBudgetResults be the result of querying the budget given pendingContributions’ unconditional contributions, reportingOrigin, api, currentWallTime and false.
- Assert: provisionalBudgetResults’ size equals pendingContributions’ unconditional contributions' size.
- For each i of the range 0 to provisionalBudgetResults’ size, exclusive:
  - If provisionalBudgetResults[i] is false, set pendingContributions’ unconditional contributions[i]'s value to 0.
- Remove all items from pendingContributions’ unconditional contributions that have a value of 0.
- Let insufficientBudget be a boolean indicating whether any value in provisionalBudgetResults is false.
- Record the internal error event result given pendingContributions, "insufficient-budget" and insufficientBudget.
- Let pendingReportLimitReached be a boolean determined by an implementation-defined algorithm.
  Note: This is intended to indicate when a limit on the number of reports simultaneously pending on a budget query was reached (but not exceeded).
- Record the internal error event result given pendingContributions, "pending-report-limit-reached" and pendingReportLimitReached.
- Let isEmptyAndWouldBeDropped be a boolean indicating whether both isDeterministicReport is false and pendingContributions’ unconditional contributions is empty.
- Record the internal error event result given pendingContributions, "empty-report-dropped" and isEmptyAndWouldBeDropped.
- Let effectiveMaxContributions be the result of determining the max contributions with api and preSpecifiedParams’ max contributions.
- Let tooManyContributions be false.
- Let provisionallyApprovedMergeKeys be a new set.
- For each contribution of pendingContributions’ unconditional contributions:
  - If provisionallyApprovedMergeKeys contains contribution’s merge key, continue.
  - If provisionallyApprovedMergeKeys’ size is less than effectiveMaxContributions, append contribution’s merge key to provisionallyApprovedMergeKeys.
  - Otherwise, set tooManyContributions to true.
- Record the internal error event result given pendingContributions, "too-many-contributions" and tooManyContributions.
- Remove all items that have a merge key that is not contained in provisionallyApprovedMergeKeys from pendingContributions’ unconditional contributions.
- Let reportSuccess be the boolean indicating whether none of the following are true: isEmptyAndWouldBeDropped, tooManyContributions, insufficientBudget, pendingReportLimitReached.
- Record the internal error event result given pendingContributions, "report-success" and reportSuccess.
- Let allUnmergedContributions be a new list.
- For each errorEvent of all error events:
  - If pendingContributions’ conditional contributions[errorEvent] does not exist, continue.
  - Assert: pendingContributions’ triggered error events contains errorEvent or errorEvent is already triggered external error.
    Note: When an internal error event is determined not to have been triggered, its conditional contributions are removed by record the internal error event result.
  - Extend allUnmergedContributions with pendingContributions’ conditional contributions[errorEvent].
- Extend allUnmergedContributions with pendingContributions’ unconditional contributions.
  Note: Unconditional contributions come last to prioritize successful measurement of errors.
- Return allUnmergedContributions.
A PAHistogramContribution contribution has a merge key that is the following tuple: (contribution’s bucket, contribution’s filteringId).

Note: Two PAHistogramContributions for the same report can be merged if and only if they have the same merge key.
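Merging by merge key can be sketched as below; the function name and the string form of the (bucket, filteringId) tuple key are illustrative, and merging sums the values of contributions that share a key.

```javascript
// Merge contributions that share a merge key, i.e. the same
// (bucket, filteringId) tuple, by summing their values. Insertion
// order of first occurrence is preserved, matching an ordered map.
function mergeContributions(contributions) {
  const merged = new Map();
  for (const c of contributions) {
    const mergeKey = `${c.bucket}/${c.filteringId}`; // illustrative key form
    const existing = merged.get(mergeKey);
    if (existing) {
      existing.value += c.value;
    } else {
      merged.set(mergeKey, { ...c });
    }
  }
  return [...merged.values()];
}
```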
Given a list of PAHistogramContributions contributions, an origin origin, a context type api, a moment currentTime and a boolean consumeIfPermitted, perform the following steps. They return a list of booleans with the same size as contributions.
- Let resultForEachContribution be a new list.
- Let approvedValueSum be 0.
- For each contribution of contributions:
  - Let valueToRequest be approvedValueSum + contribution’s value.
    Note: This ensures that the result for each contribution takes into account any earlier approved contributions from this call (even if consumeIfPermitted is false).
  - Let sufficientBudget be a boolean determined by an implementation-defined algorithm given valueToRequest, origin, api and currentTime. This algorithm should bound budget usage over time, e.g. the contribution sum over the last 24 hours.
  - If sufficientBudget, add contribution’s value to approvedValueSum.
  - Append sufficientBudget to resultForEachContribution.
    Note: The ith element in the return value indicates whether there is enough budget left to send contributions[i]'s value.
- If consumeIfPermitted, consume budget using an implementation-defined algorithm given approvedValueSum, origin, api and currentTime.
- Return resultForEachContribution.
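The budget-query loop can be sketched as below. This is non-normative: the implementation-defined budget check is modeled as a simple cap (`budget_left`), and `consume_fn` stands in for the implementation-defined consumption step:

```python
def query_budget(values, budget_left, consume_if_permitted, consume_fn):
    """Return one boolean per value. Earlier approved values count against
    later requests even when consume_if_permitted is false."""
    results = []
    approved_value_sum = 0
    for value in values:
        # Each request includes the values already approved in this call.
        value_to_request = approved_value_sum + value
        sufficient_budget = value_to_request <= budget_left  # modeled check
        if sufficient_budget:
            approved_value_sum += value
        results.append(sufficient_budget)
    if consume_if_permitted:
        consume_fn(approved_value_sum)
    return results
```

Note how a large contribution can be rejected while a later, smaller one is still approved, since each check uses the running sum of approved values only.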
- If wasTriggered, append errorEvent to pendingContributions’ triggered error events.
- Otherwise, remove pendingContributions’ conditional contributions[errorEvent].
Given a list of PAHistogramContributions contributions, a debug details debugDetails, an aggregation coordinator aggregationCoordinator, a pre-specified report parameters preSpecifiedParams, a moment or null timeout and a moment currentTime, perform the following steps. They return an aggregatable report.
- Assert: reportingOrigin is a potentially trustworthy origin.
- Let reportTime be the result of running obtain a report delivery time given currentTime and timeout.
- Let report be a new aggregatable report with the items:
  - reporting origin: reportingOrigin
  - original report time: reportTime
  - report time: reportTime
  - contributions: contributions
  - api: api
  - report ID: the result of generating a random UUID
  - debug details: debugDetails
  - aggregation coordinator: aggregationCoordinator
  - context ID: preSpecifiedParams’ context ID
  - filtering ID max bytes: preSpecifiedParams’ filtering ID max bytes
  - max contributions: the result of determining the max contributions with api and preSpecifiedParams’ max contributions
  - queued: false
- Return report.
- If timeout is not null, return timeout.
- If automation local testing mode enabled is true, return currentTime.
- Let r be a random double between 0 (inclusive) and 1 (exclusive) with uniform probability.
- Return currentTime + minimum report delay + r * randomized report delay.
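A non-normative sketch of the delivery-time computation, working in seconds. The two delay constants are illustrative assumptions only; the spec leaves their values implementation-defined:

```python
import random

MINIMUM_REPORT_DELAY = 10 * 60     # assumed value, seconds
RANDOMIZED_REPORT_DELAY = 50 * 60  # assumed value, seconds

def obtain_report_delivery_time(current_time, timeout=None,
                                local_testing_mode=False):
    """Mirror of the steps above; delay constants are illustrative."""
    if timeout is not None:
        return timeout
    if local_testing_mode:
        return current_time  # no delay under WebDriver local testing mode
    r = random.random()  # uniform in [0, 1)
    return current_time + MINIMUM_REPORT_DELAY + r * RANDOMIZED_REPORT_DELAY
```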
- If maxContributions is null, return default maxContributions by API[api].
- If maxContributions is greater than maximum maxContributions, return maximum maxContributions.
- Return maxContributions.
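The clamping logic can be sketched as follows. The per-API defaults and the maximum here are placeholder assumptions, not normative values:

```python
# Placeholder values; the spec leaves these maps implementation-defined.
DEFAULT_MAX_CONTRIBUTIONS_BY_API = {
    "shared-storage": 20,
    "protected-audience": 100,
}
MAXIMUM_MAX_CONTRIBUTIONS = 1000  # assumed upper bound

def determine_max_contributions(api, max_contributions=None):
    """Fall back to the per-API default, otherwise clamp to the maximum."""
    if max_contributions is None:
        return DEFAULT_MAX_CONTRIBUTIONS_BY_API[api]
    return min(max_contributions, MAXIMUM_MAX_CONTRIBUTIONS)
```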
9.3. Sending reports
Note: This section is largely copied from the Attribution Reporting API spec, adapting as necessary.
Do we have to use the queue a task algorithm here?
The user agent must periodically attempt to queue reports for sending given its aggregatable report cache.
- For each report of reports, run these steps in parallel:
  - Run these steps, but abort when the user agent shuts down:
    - If report’s queued value is true, return.
    - Set report’s queued value to true.
    - Let currentWallTime be the current wall time.
    - If report’s report time is before currentWallTime, set report’s report time to currentWallTime plus an implementation-defined random non-negative duration.
      Note: On startup, it is possible the user agent will need to send many reports whose report times passed while the browser was closed. Adding random delay prevents temporal joining of reports.
    - Wait until the current wall time is equal to or after report’s report time.
    - Optionally, wait a further implementation-defined non-negative duration.
      Note: This is intended to allow user agents to optimize device resource usage and wait for the user agent to be online.
    - Run attempt to deliver a report with report.
  - If aborted, set report’s queued value to false.
    Note: It might be more practical to perform this step when the user agent next starts up.
- Let url be the result of obtaining a reporting endpoint given report’s reporting origin and report’s api.
- Let data be the result of serializing an aggregatable report given report.
- If data is an error, remove report from the aggregatable report cache.
- Let request be the result of creating a report request given url and data.
- Queue a task to fetch request with processResponse being the following steps:
  - Let shouldRetry be an implementation-defined boolean. The value should be false if no error occurred.
  - If shouldRetry is true:
    - Set report’s report time to the current wall time plus an implementation-defined non-negative duration.
    - Set report’s queued value to false.
  - Otherwise, remove report from the aggregatable report cache.
- Assert: reportingOrigin is a potentially trustworthy origin.
- Let path be the concatenation of «".well-known/private-aggregation/report-", api».
  Register this well-known directory. [Issue #67]
- Let base be the result of running the URL parser on the serialization of reportingOrigin.
- Assert: base is not failure.
- Let result be the result of running the URL parser on path with base.
- Assert: result is not failure.
- Return result.
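A non-normative sketch of the endpoint construction, using Python's URL resolution in place of the WHATWG URL parser (the two agree for simple cases like this):

```python
from urllib.parse import urljoin

def obtain_reporting_endpoint(reporting_origin, api):
    """Resolve the well-known report path against the reporting origin."""
    path = ".well-known/private-aggregation/report-" + api
    # Trailing slash makes the origin's serialization a valid base URL.
    return urljoin(reporting_origin + "/", path)
```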
- Let request be a new request with the following properties:
  - method: "POST"
  - URL: url
  - header list: «("Content-Type", "application/json")»
  - unsafe-request flag: set
  - body: body
  - client: null
  - window: "no-window"
  - service-workers mode: "none"
  - initiator: ""
  - referrer: "no-referrer"
  - mode: "cors"
  - credentials mode: "omit"
  - cache mode: "no-store"
- Return request.
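For illustration only, the observable shape of this request (method, URL, header, body) can be sketched with Python's standard library. Note that fetch-specific fields such as mode, credentials mode, and service-workers mode have no counterpart in `urllib` and are omitted here:

```python
import urllib.request

def create_report_request(url, body):
    """Assemble (but do not send) the POST request described above."""
    return urllib.request.Request(
        url,
        data=body,  # the serialized aggregatable report, as bytes
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```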
9.4. Serializing reports
Note: This section is largely copied from the Attribution Reporting API spec, adapting as necessary.
- Let aggregationServicePayloads be the result of obtaining the aggregation service payloads given report.
- If aggregationServicePayloads is an error, return aggregationServicePayloads.
- Let data be an ordered map of the following key/value pairs:
  - "aggregation_coordinator_origin": report’s aggregation coordinator, serialized
  - "aggregation_service_payloads": aggregationServicePayloads
  - "shared_info": the result of obtaining a report’s shared info given report
- Let debugKey be report’s debug details’s key.
- If debugKey is not null, set data["debug_key"] to debugKey.
- Let contextId be report’s context ID.
- If contextId is not null, set data["context_id"] to contextId.
- Return the byte sequence resulting from executing serialize an infra value to JSON bytes on data.
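The top-level JSON assembly can be sketched as below (non-normative; the field values passed in are placeholders, and Python dicts preserve insertion order, matching the ordered map above):

```python
import json

def serialize_report(aggregation_coordinator, payloads, shared_info,
                     debug_key=None, context_id=None):
    """Assemble the top-level report JSON and return it as UTF-8 bytes."""
    data = {
        "aggregation_coordinator_origin": aggregation_coordinator,
        "aggregation_service_payloads": payloads,
        "shared_info": shared_info,
    }
    if debug_key is not None:
        data["debug_key"] = debug_key
    if context_id is not None:
        data["context_id"] = context_id
    return json.dumps(data).encode("utf-8")
```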
- Let publicKeyTuple be the result of obtaining the public key for encryption given report’s aggregation coordinator.
- If publicKeyTuple is an error, return publicKeyTuple.
- Let (pkR, keyId) be publicKeyTuple.
- Let plaintextPayload be the result of obtaining the plaintext payload given report.
- Let sharedInfo be the result of obtaining a report’s shared info given report.
- Let encryptedPayload be the result of encrypting the payload given plaintextPayload, pkR and sharedInfo.
- If encryptedPayload is an error, return encryptedPayload.
- Let aggregationServicePayloads be a new list.
- Let aggregationServicePayload be an ordered map of the following key/value pairs:
  - "key_id": keyId
  - "payload": encryptedPayload, base64 encoded
- If report’s debug details’s enabled field is true:
  - Set aggregationServicePayload["debug_cleartext_payload"] to plaintextPayload, base64 encoded.
- Append aggregationServicePayload to aggregationServicePayloads.
- Return aggregationServicePayloads.
- Let url be a new URL record.
- Set url’s path to «".well-known", "aggregation-service", "v1", "public-keys"».
- Return an implementation-defined tuple consisting of a public key from url and a string that should uniquely identify the public key or, in the event that the user agent failed to obtain the public key from url, an error. This step may be asynchronous.
Specify this in terms of fetch. Add details about which encryption standards to use, length requirements, etc.
Note: The user agent is encouraged to enforce regular key rotation. If there are multiple keys, the user agent can independently pick a key uniformly at random for every encryption operation.
- Let payloadData be a new list.
- Let contributions be report’s contributions.
- Let maxContributions be report’s max contributions.
- Assert: contributions’ size is not greater than maxContributions.
- While contributions’ size is less than maxContributions:
  - Let nullContribution be a new PAHistogramContribution with the items: bucket: 0, value: 0, filteringId: 0.
  - Append nullContribution to contributions.
  Note: This padding protects against the number of contributions being leaked through the encrypted payload size, see discussion below.
- For each contribution of report’s contributions:
  - Let filteringIdMaxBytes be report’s filtering id max bytes.
  - Assert: contribution["filteringId"] is contained in the range 0 to 256^filteringIdMaxBytes, exclusive.
  - Let contributionData be an ordered map of the following key/value pairs:
    - "bucket": the result of encoding an integer for the payload given contribution["bucket"] and 16
    - "value": the result of encoding an integer for the payload given contribution["value"] and 4
    - "id": the result of encoding an integer for the payload given contribution["filteringId"] and filteringIdMaxBytes
  - Append contributionData to payloadData.
- Let payload be an ordered map of the following key/value pairs:
  - "data": payloadData
  - "operation": "histogram"
- Return the byte sequence resulting from CBOR encoding payload.
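The padding and fixed-width integer encoding can be sketched as below. This is non-normative: contributions are modeled as plain dicts, the sketch builds the pre-CBOR map without mutating its input, and the final CBOR encoding step is omitted (a CBOR library would be applied to the returned map in a real implementation):

```python
def encode_integer(value, byte_length):
    """Big-endian, fixed-width encoding, per 'encoding an integer for the
    payload'."""
    return value.to_bytes(byte_length, "big")

def obtain_plaintext_payload_map(contributions, max_contributions,
                                 filtering_id_max_bytes):
    assert len(contributions) <= max_contributions
    # Pad with null contributions so the payload size leaks nothing about
    # the real number of contributions.
    padded = contributions + [{"bucket": 0, "value": 0, "filteringId": 0}] * (
        max_contributions - len(contributions))
    payload_data = []
    for c in padded:
        assert 0 <= c["filteringId"] < 256 ** filtering_id_max_bytes
        payload_data.append({
            "bucket": encode_integer(c["bucket"], 16),
            "value": encode_integer(c["value"], 4),
            "id": encode_integer(c["filteringId"], filtering_id_max_bytes),
        })
    return {"data": payload_data, "operation": "histogram"}
```

Because every payload contains exactly `max_contributions` entries with fixed-width fields, all payloads for a given configuration encode to the same length.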
- Let info be the result of UTF-8 encoding the concatenation of «"aggregation_service", sharedInfo».
- Let (kem_id, kdf_id, aead_id) be (0x0020, 0x0001, 0x0003).
  Note: The ciphersuite triple above is composed of HPKE algorithm identifiers, specifying the KEM as DHKEM(X25519, HKDF-SHA256), the KDF function as HKDF-SHA256 and the AEAD function as ChaCha20Poly1305.
- Let (enc, hpkeContext) be the result of setting up an HPKE sender’s context by calling SetupBaseS() with a public key pkR, application-supplied information info, KEM kem_id, KDF kdf_id, and AEAD aead_id. If this operation fails, return an error.
  Note: For clarity, we explicitly passed the KEM, KDF, and AEAD identifiers to SetupBaseS() above, even though RFC9180 omits the parameters from its pseudocode.
- Let aad be `` (an empty byte sequence).
- Let ciphertext be the result of sealing the payload by calling ContextS.Seal() on the hpkeContext object with additional authenticated data aad and plaintext plaintextPayload. If this operation fails, return an error.
- Let encryptedPayload be the concatenation of the byte sequences « enc, ciphertext ».
  Note: The length of the encapsulated symmetric key enc generated by our chosen KEM is exactly 32 bytes, as shown in RFC9180’s table of KEM IDs.
- Return the byte sequence encryptedPayload.
- Let scheduledReportTime be the duration from the UNIX epoch to report’s original report time.
- Let sharedInfo be an ordered map of the following key/value pairs:
  - "api": report’s api
  - "report_id": report’s report ID
  - "reporting_origin": the serialization of report’s reporting origin
  - "scheduled_report_time": the number of seconds in scheduledReportTime, rounded down to the nearest number of whole seconds and serialized
  - "version": "1.0"
- Return the result of serializing an infra value to a JSON string given sharedInfo.
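A non-normative sketch of the shared-info serialization. The argument values in the test are placeholders; the sketch assumes the floored second count is serialized as a string, with keys in the order listed above:

```python
import json

def obtain_shared_info(api, report_id, reporting_origin,
                       scheduled_report_time_seconds):
    """Serialize the shared_info map; the scheduled report time is floored
    to whole seconds before serialization."""
    shared_info = {
        "api": api,
        "report_id": report_id,
        "reporting_origin": reporting_origin,
        "scheduled_report_time": str(int(scheduled_report_time_seconds)),
        "version": "1.0",
    }
    return json.dumps(shared_info)
```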
10. User-agent automation
A user agent holds a boolean automation local testing mode enabled (default false).
For the purposes of user-agent automation and website testing, this document defines the below [WebDriver] extension commands to control the API configuration.
10.1. Set local testing mode
HTTP Method | URI Template
---|---
POST | /session/{session id}/private-aggregation/localtestingmode
- If parameters is not a JSON-formatted Object, return a WebDriver error with error code invalid argument.
- Let enabled be the result of getting a property named "enabled" from parameters.
- If enabled is undefined or is not a boolean, return a WebDriver error with error code invalid argument.
- Set automation local testing mode enabled to enabled.
- Return success with data null.
Note: Without this, aggregatable reports would be subject to delays, making testing difficult.
11. Privacy considerations
This section is non-normative.
11.1. Cross-site information disclosure
This API lets isolated contexts with access to cross-site data (i.e. Shared Storage worklets/Protected Audience script runners) send aggregatable reports over the network.
Aggregatable reports contain encrypted high entropy cross-site information, in the form of key-value pairs (i.e. contributions to a histogram). The information embedded in the contributions is arbitrary but can include things like browsing history and other cross-site activity. The API aims to protect this information from being passed from one site to another.
11.1.1. Restricted contribution processing
The histogram contributions are not exposed directly. Instead, they are encrypted so that they can only be processed by a trusted aggregation service. This trusted aggregation service sums the values across the reports for each key and adds noise to each of these values to produce ‘summary reports’.
The output of that processing will be an aggregated, noised histogram. The service ensures that any report cannot be processed multiple times. Further, information exposure is limited by contribution budgets on the user agent. In principle, this framework can support specifying a noise parameter which satisfies differential privacy.
11.1.2. Unencrypted metadata
These reports also expose a limited amount of metadata, which is not based on cross-site data. The recipient of the report may also be able to observe side-channel information, such as the time when the report was sent or the IP address of the sender.
11.1.3. Protecting against leaks via the number of reports
However, the number of reports with the given metadata could expose some cross-site information. To protect against this, the API delays sending reports by a randomized amount of time to make it difficult to determine whether a report was sent or not from any particular event. In the case that a context ID is supplied, a non-default filtering ID max bytes is specified, or a non-default max contributions is specified, the API makes the number of reports sent deterministic (sending 'null reports' if necessary — each containing only a contribution with a value of 0 in the payload). Additional mitigations may also be possible in the future, e.g. adding noise to the report count.
11.1.4. Protecting against leaks via payload size
The length of the encrypted payload could additionally expose some cross-site information, namely the number of contributions present in the plaintext payload. To eliminate this side channel, Private Aggregation ensures that payloads contain a predetermined number of contributions prior to encryption, potentially truncating or padding with null contributions to match the target.
When max contributions is non-null, Private Aggregation uses it to inform the target number of contributions. Otherwise, the target number is drawn from default maxContributions by API based on the caller’s context type.
11.1.5. Temporary debugging mechanism
The enableDebugMode()
method allows for many of the
protections of this API to be bypassed to ease testing and integration.
Specifically, the contents of the payload, i.e. the histogram contributions, are
revealed in the clear when the debug mode is enabled. Optionally, a debug key
can also be set to associate the report with the calling context. In the future,
this mechanism will only be available for callers that are eligible to set
third-party cookies. In that case, the API caller already has the ability to
communicate information cross-site.
Tie enableDebugMode()
to third-party cookie
eligibility. [Issue #57]
11.1.6. Privacy parameters
The amount of information exposed by this API is a product of the privacy parameters used (e.g. contribution limits and the noise distribution used in the aggregation service). While we aim to minimize the amount of information exposed, we also aim to support a wide range of use cases. The privacy parameters are left implementation-defined to allow different and evolving choices in the tradeoffs between information exposure and utility.
11.2. Clearing site data
The aggregatable report cache as well as any contribution history data stored for the query the budget algorithm contain data about a user’s web activity. As such, user controls to delete this data are required, see clearing storage.
On the other hand, the contribution cache, the debug scope map and the pre-specified report parameters map only contain short-lived data tied to particular batching scopes and debug scopes, so controls are not required.
11.3. Reporting delay concerns
Delaying sending reports after API invocation can enable side-channel leakage in some situations.
11.3.1. Cross-network reporting origin leakage
A report may be stored while the browser is connected to one network but sent while the browser is connected to a different network, potentially enabling cross-network leakage of the reporting origin.
Example: A user runs the browser with a particular browsing profile on their home network. An aggregatable report with a particular reporting origin is stored with a report time in the future. After the report time is reached, the user runs the browser with the same browsing profile on their employer’s network, at which point the browser sends the report to the reporting origin. Although the report itself may be sent over HTTPS, the reporting origin may be visible to the network administrator via DNS or the TLS client hello (which can be mitigated with ECH). Some reporting origins may be known to operate only or primarily on sensitive sites, so this could leak information about the user’s browsing activity to the user’s employer without their knowledge or consent.
Possible mitigations include:
-
Only sending reports with a given reporting origin when the browser has already made a request to that origin on the same network: This prevents the network administrator from gaining additional information from the Private Aggregation API. However, it increases report loss and report delays, which reduces the utility of the API for the reporting origin. It might also increase the effectiveness of timing attacks, as the origin may be able to better link the report with the user’s request that allowed the report to be released.
-
Send reports immediately: This reduces the likelihood of a report being stored and sent on different networks. However, it increases the likelihood that the reporting origin can correlate the original API invocation to the report being sent, which weakens the privacy controls of the API, see Protecting against leaks via the number of reports.
-
Use a trusted proxy server to send reports: This effectively moves the reporting origin into the report body, so only the proxy server would be visible to the network administrator.
-
Require DNS over HTTPS: This effectively hides the reporting origin from the network administrator, but is likely impractical to enforce and is itself perhaps circumventable by the network administrator, e.g. by monitoring IP addresses instead.
11.3.2. User-presence tracking
The browser only tries to send reports while it is running and while it has internet connectivity (even without an explicit check for connectivity, naturally the report will fail to be sent if there is none), so receiving or not receiving a (serialized) aggregatable report at the original report time leaks information about the user’s presence. Additionally, because the report request inherently includes an IP address, this could reveal the user’s IP-derived location to the reporting origin, including at-home vs. at-work or approximate real-world geolocation, or reveal patterns in the user’s browsing activity.
Possible mitigations include:
-
Send reports immediately: This effectively eliminates the presence tracking, as the original request made to the reporting origin is in close temporal proximity to the report request. However, it increases the likelihood that the reporting origin can correlate the original API invocation to the report being sent, which weakens the privacy controls of the API, see Protecting against leaks via the number of reports.
-
Send reports immediately to a trusted proxy server, which would itself apply additional delay: This would effectively hide both the user’s IP address and their online-offline presence from the reporting origin.
12. Security considerations
This section is non-normative.
12.1. Same-origin policy
Writes to the aggregatable report cache, contribution cache, debug scope map and pre-specified report parameters map are attributed to the reporting origin and the data included in any report with a given reporting origin are generated with only data from that origin.
One notable exception is the query the budget algorithm which is implementation-defined and can consider contribution history from other origins. For example, the algorithm could consider all history from a particular site. This would be an explicit relaxation of the same-origin policy as multiple origins would be able to influence the API’s behavior. One particular risk of these kinds of shared limits is the introduction of denial of service attacks, where a group of origins could collude to intentionally consume all available budget, causing subsequent origins to be unable to access the API. This trades off security for privacy, in that the limits are there to reduce the efficacy of many origins colluding together to violate privacy. However, this security risk is lessened if the set of origins limited are all same site. User agents should consider these tradeoffs when choosing the query the budget algorithm.
12.2. Protecting the histogram contributions
As discussed above, the processing of histogram contributions is limited to protect privacy. This limitation relies on only the trusted aggregation service being able to access the unencrypted histogram contributions.
To ensure this, this API uses HPKE, a modern encryption specification. Additionally, each user agent is encouraged to require regular key rotation by the aggregation service. This limits the amount of data encrypted with the same key and thus the amount of vulnerable data in the case of a key being compromised.
While not specified here, each user agent is strongly encouraged to consider the security of any aggregation service design before allowing its public keys to be returned by obtain the public key for encryption.