@huksley huksley commented Dec 19, 2025

This adds support for running clusters through the official Verda Cloud SDK.

It also fixes the default BASE_URL so that it is https://api.verda.com/v1

Creating a cluster is as simple as:

ssh_key = verda_client.ssh_keys.get()[0]
cluster = verda_client.clusters.create(
    hostname='test-cluster',
    location=Locations.FIN_03,
    cluster_type='16H200',
    description='my machine learning cluster 2x8H200',
    image='ubuntu-22.04-cuda-12.4-cluster',
    ssh_key_ids=[ssh_key.id],
)

Also implemented:

  • verda_client.clusters.is_available('16B200', Locations.FIN_03): bool
  • verda_client.clusters.get_availabilities(Locations.FIN_03): list[str]
  • verda_client.clusters.get_cluster_images('16B200'): list[str]
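A minimal sketch of how the availability check might be used before calling create. It relies only on the `is_available(cluster_type, location)` signature listed above; the helper name `first_available`, the stand-in client object, and the `'FIN-03'` location string are assumptions made so the example runs offline, not part of the SDK.

```python
from types import SimpleNamespace


def first_available(clusters, candidates, location):
    """Return the first cluster type in `candidates` that is available, else None.

    `clusters` is any object exposing is_available(cluster_type, location) -> bool,
    for example verda_client.clusters.
    """
    for cluster_type in candidates:
        if clusters.is_available(cluster_type, location):
            return cluster_type
    return None


# Hypothetical stand-in for verda_client.clusters so the sketch runs without network access.
fake_clusters = SimpleNamespace(
    is_available=lambda cluster_type, location: cluster_type == '16B200'
)

# 'FIN-03' is a placeholder; with the real SDK you would pass Locations.FIN_03.
chosen = first_available(fake_clusters, ['16H200', '16B200'], 'FIN-03')
print(chosen)
```

With the real client, the returned type could then be passed as `cluster_type` to `verda_client.clusters.create(...)`.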

@huksley huksley requested a review from shamrin December 19, 2025 15:48
claude and others added 4 commits December 19, 2025 17:55
Implemented comprehensive support for the Clusters API:
- Created ClustersService with methods for cluster management (create, get, delete, scale)
- Added Cluster and ClusterNode dataclasses for type safety
- Integrated clusters service into VerdaClient
- Added ClusterStatus constants for cluster lifecycle management
- Created comprehensive unit tests (13 tests covering all API operations)
- Added detailed example demonstrating cluster operations
- All tests pass (125/125)
@huksley huksley requested a review from tamirse December 22, 2025 10:01
f'Cluster {id} did not enter creating state within {max_wait_time:.1f} seconds'
)

interval = min(initial_interval * backoff_coefficient**i, max_interval, deadline - now)
Collaborator

This is duplicated in instances.py, so it can be extracted and reused by both files.

Contributor Author

Honestly, I don't know how to make it reusable, because it checks either the cluster status or the instance status.
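One possible way to share the loop, sketched under assumptions: keep the backoff arithmetic from the quoted line in a single helper and pass in the part that differs (how the status is fetched and which status counts as done) as callables. The name `wait_for_status`, its default values, and the `'pending'` status are hypothetical; only the interval formula and the `'creating'` state come from the code under review.

```python
import time


def wait_for_status(get_status, is_done, *, max_wait_time=180.0, initial_interval=0.5,
                    max_interval=5.0, backoff_coefficient=2.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() with exponential backoff until is_done(status) is true.

    get_status and is_done are supplied by the caller, so the same loop can
    watch a cluster or an instance. Raises TimeoutError after max_wait_time.
    """
    deadline = clock() + max_wait_time
    i = 0
    while True:
        status = get_status()
        if is_done(status):
            return status
        now = clock()
        if now >= deadline:
            raise TimeoutError(
                f'Status is still {status!r} after {max_wait_time:.1f} seconds'
            )
        # Same formula as in the reviewed line: grow the delay, cap it,
        # and never sleep past the deadline.
        interval = min(initial_interval * backoff_coefficient**i, max_interval, deadline - now)
        sleep(interval)
        i += 1


# Offline demo with a scripted status source ('pending' is a placeholder status).
scripted = iter(['pending', 'pending', 'creating'])
final = wait_for_status(lambda: next(scripted), lambda s: s == 'creating',
                        initial_interval=0.01)
print(final)
```

The cluster and instance services would each pass their own status lookup as `get_status`, which keeps the timing logic in one place without the helper needing to know which resource it is polling.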

print(cluster)

# delete cluster
# verda_client.clusters.action(cluster.id, 'delete')
Collaborator

uncomment?

Contributor Author

It can't really be deleted until it is running, and that usually means a 20-minute wait.

@huksley huksley requested a review from tamirse December 23, 2025 07:39
4 participants