27 commits
7b4c635
gsoc26: Layer 2 with tests initial commit
Jun 5, 2026
5d224f6
gsoc26: Refactor Layer 2 with handler architecture and improved tests
Jun 6, 2026
f2fe92e
Improvements to Format layer implementation
Jun 10, 2026
180c255
Review comments resolved under issue #59
Jun 12, 2026
34039f2
#61: replacing --convert_from & --convert_to with --compression
Jun 13, 2026
cd5e990
gsoc26: layer2 complete + implementation for #61
Jun 14, 2026
508bf86
gsoc2026: mapping layer initial commit
Jun 15, 2026
9aba4f6
fix: errors from testing fixed.
Jun 16, 2026
549ea3b
docs: update README for --format and --compression flags
Jun 16, 2026
2c50924
docs: update README for --format and --compression flags
Jun 16, 2026
4ff66d9
fix: resolve PR review comments.
Jun 19, 2026
9a541ee
Merge branch 'feature/format-conversion' into gsoc-2026
Jun 20, 2026
6c0200e
fix: remove duplicate import after merging layer2 fixes into layer3
Jun 20, 2026
d5db670
docs: add Layer 3 mapping conversion flags and examples
Jun 20, 2026
cac241f
merge: sync with upstream/gsoc-2026 after PR #62 merge
Jun 23, 2026
5c905f5
Initial commit: Manifest System
Jun 26, 2026
6a3f728
fix: move all test data to tests/resources, add round trip IR compari…
Jun 28, 2026
092702a
fix: align Quad->Triple->Quad round trip test as required.
Jun 29, 2026
3185c6b
fix: fixed review comments
Jun 29, 2026
9948f58
fix: fixed review comments(2)
Jun 29, 2026
1836d50
Merge branch 'gsoc-2026' of https://github.com/dbpedia/databus-python…
Jun 29, 2026
6a9c5e1
Merge branch 'gsoc-2026' into dev
Jun 29, 2026
9bfcc21
Complete Implementation of Milestone 2
Jun 30, 2026
ebe0fe4
edge case handled
Jun 30, 2026
393aa6c
docs: add Manifest section to README documenting --manifest flag
Jun 30, 2026
485bd41
feat: capture operation-level errors in manifest via dbus:operationError
Jul 2, 2026
9363657
fix: resolve PR review comments
Jul 4, 2026
52 changes: 51 additions & 1 deletion README.md
@@ -18,6 +18,7 @@ Command-line and Python client for downloading and deploying datasets on DBpedia
- [Download](#cli-download)
- [Deploy](#cli-deploy)
- [Delete](#cli-delete)
- [Manifest](#cli-manifest)
- [Module Usage](#module-usage)
- [Deploy](#module-deploy)
- [Development & Contributing](#development--contributing)
@@ -556,6 +557,55 @@ databusclient delete https://databus.dbpedia.org/dbpedia/collections/dbpedia-sna
docker run --rm -v $(pwd):/data dbpedia/databus-python-client delete https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12 --databus-key YOUR_API_KEY
```

<a id="cli-manifest"></a>
### Manifest

The `download`, `deploy`, and `delete` commands each accept an optional `--manifest` flag that takes an output path and writes a structured JSON-LD record of the operation to that path:

**Download**
```bash
# Python
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2 --manifest ./manifests/download-run.jsonld
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2 --manifest ./manifests/download-run.jsonld
```

**Deploy**
```bash
# Python
databusclient deploy \
--version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 \
--title "Client Testing" --abstract "Testing the client...." \
--description "Testing the client...." \
--license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 \
--apikey YOUR_KEY --manifest ./manifests/deploy-run.jsonld \
'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client deploy \
--version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 \
--title "Client Testing" --abstract "Testing the client...." \
--description "Testing the client...." \
--license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 \
--apikey YOUR_KEY --manifest ./manifests/deploy-run.jsonld \
'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
```

**Delete**
```bash
# Python
databusclient delete https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01 --databus-key YOUR_API_KEY --manifest ./manifests/delete-run.jsonld
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client delete https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01 --databus-key YOUR_API_KEY --manifest ./manifests/delete-run.jsonld
```

The manifest records input parameters, per-file URLs, checksums, byte sizes, timestamps, and success/failure status for each file. It uses the DataID vocabulary and is versioned via `dbus:schemaVersion`.

- If the target path already exists, the manifest is written to an auto-suffixed path (e.g. `run_1.jsonld`) with a warning.
- Sensitive fields (API keys, vault tokens) are never written.
- If manifest writing fails, a warning is printed and the exit code reflects the actual operation result.
- If the operation itself fails, a `dbus:operationError` block is recorded in the manifest capturing the error type, message, and traceback.

See `examples/reproducible-download.md` for a full walkthrough.
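
The auto-suffix behaviour described in the list above can be pictured with a short sketch. This is only an illustration under stated assumptions: `resolve_manifest_path` is a hypothetical helper name, and the `_1`, `_2`, ... numbering is inferred from the `run_1.jsonld` example; the client's actual implementation may differ.

```python
import tempfile
import warnings
from pathlib import Path


def resolve_manifest_path(target: Path) -> Path:
    """Return `target` if it is free, else the first free `<stem>_<n><suffix>` sibling.

    Hypothetical helper for illustration; not taken from the client's code.
    """
    if not target.exists():
        return target
    counter = 1
    while True:
        candidate = target.with_name(f"{target.stem}_{counter}{target.suffix}")
        if not candidate.exists():
            # Mirror the documented behaviour: warn, then use the suffixed path.
            warnings.warn(f"{target} already exists; writing manifest to {candidate}")
            return candidate
        counter += 1


with tempfile.TemporaryDirectory() as tmp:
    first = resolve_manifest_path(Path(tmp) / "run.jsonld")
    first.write_text("{}")  # occupy the requested path
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # keep the demo output quiet
        second = resolve_manifest_path(Path(tmp) / "run.jsonld")
    print(first.name, second.name)  # run.jsonld run_1.jsonld
```

Choosing a fresh path instead of overwriting keeps earlier manifests intact, which matters if they are used as records of previous runs.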

## Module Usage

<a id="module-deploy"></a>
@@ -675,4 +725,4 @@ Or to ensure compatibility with the `pyproject.toml` configured dependencies, ru

```bash
poetry run pytest tests/
```
37 changes: 25 additions & 12 deletions databusclient/api/delete.py
@@ -23,13 +23,16 @@ class DeleteQueue:
     Allows adding multiple databus URIs to a queue and executing their deletion in batch.
     """

-    def __init__(self, databus_key: str):
+    def __init__(self, databus_key: str, manifest_context=None):
         """Create a DeleteQueue bound to a given Databus API key.

         Args:
             databus_key: API key used to authenticate deletion requests.
+            manifest_context: Optional ManifestContext to record deletion
+                outcomes into. Passed through to _delete_list on execute().
         """
         self.databus_key = databus_key
+        self.manifest_context = manifest_context
         self.queue: set[str] = set()

     def add_uri(self, databusURI: str):
@@ -69,11 +72,13 @@ def execute(self):
         """Execute all queued deletions.

         Each queued URI will be deleted using `_delete_resource`.
+        Passes manifest_context through so deletions are recorded.
         """
         _delete_list(
             list(self.sorted_queue()),
             self.databus_key,
             force=True,
+            manifest_context=self.manifest_context,
         )


@@ -116,6 +121,7 @@ def _delete_resource(
     dry_run: bool = False,
     force: bool = False,
     queue: DeleteQueue = None,
+    manifest_context=None,
 ):
     """Delete a single Databus resource (version, artifact, group).

@@ -144,6 +150,8 @@ def _delete_resource(

     if dry_run:
         print(f"[DRY RUN] Would delete: {databusURI}")
+        if manifest_context is not None:
+            manifest_context.record_file(url=databusURI, status="dry_run")
         return

     if queue is not None:
@@ -156,6 +164,8 @@ def _delete_resource(

     if response.status_code in (200, 204):
         print(f"Successfully deleted: {databusURI}")
+        if manifest_context is not None:
+            manifest_context.record_file(url=databusURI, status="success")
     else:
         raise Exception(
             f"Failed to delete {databusURI}: {response.status_code} - {response.text}"
@@ -168,6 +178,7 @@ def _delete_list(
     dry_run: bool = False,
     force: bool = False,
     queue: DeleteQueue = None,
+    manifest_context=None,
 ):
     """Delete a list of Databus resources.

@@ -180,7 +191,7 @@ def _delete_list(
     """
     for databusURI in databusURIs:
         _delete_resource(
-            databusURI, databus_key, dry_run=dry_run, force=force, queue=queue
+            databusURI, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
         )


@@ -190,6 +201,7 @@ def _delete_artifact(
     dry_run: bool = False,
     force: bool = False,
     queue: DeleteQueue = None,
+    manifest_context=None,
 ):
     """Delete an artifact and all its versions.

@@ -223,11 +235,11 @@ def _delete_artifact(
     else:
         # Delete all versions
         _delete_list(
-            version_uris, databus_key, dry_run=dry_run, force=force, queue=queue
+            version_uris, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
         )

     # Finally, delete the artifact itself
-    _delete_resource(databusURI, databus_key, dry_run=dry_run, force=force, queue=queue)
+    _delete_resource(databusURI, databus_key, dry_run=dry_run, force=force, queue=queue,manifest_context=manifest_context)


 def _delete_group(
@@ -236,6 +248,7 @@ def _delete_group(
     dry_run: bool = False,
     force: bool = False,
     queue: DeleteQueue = None,
+    manifest_context=None,
 ):
     """Delete a group and all its artifacts and versions.

@@ -266,14 +279,14 @@ def _delete_group(
     # Delete all artifacts (which deletes their versions)
     for artifact_uri in artifact_uris:
         _delete_artifact(
-            artifact_uri, databus_key, dry_run=dry_run, force=force, queue=queue
+            artifact_uri, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
         )

     # Finally, delete the group itself
-    _delete_resource(databusURI, databus_key, dry_run=dry_run, force=force, queue=queue)
+    _delete_resource(databusURI, databus_key, dry_run=dry_run, force=force, queue=queue,manifest_context=manifest_context)


-def delete(databusURIs: List[str], databus_key: str, dry_run: bool, force: bool):
+def delete(databusURIs: List[str], databus_key: str, dry_run: bool, force: bool, manifest_context=None):
     """Delete a dataset from the databus.

     Delete a group, artifact, or version identified by the given databus URI.
@@ -286,7 +299,7 @@ def delete(databusURIs: List[str], databus_key: str, dry_run: bool, force: bool)
         force: If True, skip confirmation prompt and proceed with deletion.
     """

-    queue = DeleteQueue(databus_key)
+    queue = DeleteQueue(databus_key, manifest_context=manifest_context)

     for databusURI in databusURIs:
         _host, _account, group, artifact, version, file = (
@@ -296,24 +309,24 @@ def delete(databusURIs: List[str], databus_key: str, dry_run: bool, force: bool)
         if group == "collections" and artifact is not None:
             print(f"Deleting collection: {databusURI}")
             _delete_resource(
-                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue
+                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
             )
         elif file is not None:
             print(f"Deleting file is not supported via API: {databusURI}")
         elif version is not None:
             print(f"Deleting version: {databusURI}")
             _delete_resource(
-                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue
+                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
             )
         elif artifact is not None:
             print(f"Deleting artifact and all its versions: {databusURI}")
             _delete_artifact(
-                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue
+                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
             )
         elif group is not None and group != "collections":
             print(f"Deleting group and all its artifacts and versions: {databusURI}")
             _delete_group(
-                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue
+                databusURI, databus_key, dry_run=dry_run, force=force, queue=queue, manifest_context=manifest_context
             )
         else:
             print(f"Deleting {databusURI} is not supported.")
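
The recording pattern this diff threads through `delete.py` can be exercised in isolation with a small sketch. It rests on assumptions: `RecordingContext` is a hypothetical stand-in for the real `ManifestContext` (only the `record_file(url=..., status=...)` call shape is taken from the diff), `delete_resource` is a simplified copy of the two recording branches with no HTTP request, and the URLs are placeholders.

```python
from typing import Optional


class RecordingContext:
    """Hypothetical stand-in for ManifestContext; only `record_file` mirrors the diff."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    def record_file(self, url: str, status: str) -> None:
        self.records.append({"url": url, "status": status})


def delete_resource(
    uri: str,
    dry_run: bool,
    manifest_context: Optional[RecordingContext] = None,
) -> None:
    """Simplified version of the two recording branches; no request is sent."""
    if dry_run:
        print(f"[DRY RUN] Would delete: {uri}")
        if manifest_context is not None:
            manifest_context.record_file(url=uri, status="dry_run")
        return
    # Assumption for the demo: pretend the server answered 200 or 204.
    print(f"Successfully deleted: {uri}")
    if manifest_context is not None:
        manifest_context.record_file(url=uri, status="success")


ctx = RecordingContext()
delete_resource("https://example.org/acct/group/artifact/v1", dry_run=True, manifest_context=ctx)
delete_resource("https://example.org/acct/group/artifact/v2", dry_run=False, manifest_context=ctx)
delete_resource("https://example.org/acct/group/artifact/v3", dry_run=False)  # no context: nothing recorded
print([r["status"] for r in ctx.records])  # ['dry_run', 'success']
```

Because the context defaults to `None` and every call site guards with `is not None`, existing callers that never pass a manifest keep their current behaviour. In the hunks shown here, the failure branch raises without a per-file record.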