Billing & Usage Guide
Understand how billing works and manage your costs effectively.
Billing Model
Packet.ai uses pay-as-you-go billing with prepaid credits:
| Service | Billing Unit | Notes |
|---|---|---|
| GPU Compute | Per hour | Billed while running, prorated by minute |
| Token Factory | Per 1M tokens | Input and output tokens billed separately |
| Persistent Storage | Per GB/hour | Billed continuously while volume exists |
| Network | Free | No bandwidth charges |
GPU Pricing
GPU pricing varies by type, region, and availability. Approximate rates:
| GPU Type | VRAM | Price/Hour | Best For |
|---|---|---|---|
| RTX 4090 | 24GB | $0.50 - $0.80 | Inference, 7B models, training |
| A100 40GB | 40GB | $1.50 - $2.00 | Large models, training, fine-tuning |
| A100 80GB | 80GB | $2.00 - $3.00 | 70B models, large batch training |
| H100 | 80GB | $3.00 - $4.00 | Maximum performance, production |
Prices shown are approximate and vary by region. See current pricing in the Launch GPU modal.
Token Factory Pricing
Token Factory offers pay-per-token pricing with discounts for batch processing:
Real-Time Inference
| Token Type | Price per 1M Tokens |
|---|---|
| Input Tokens | $0.03 |
| Output Tokens | $0.06 |
Batch Processing (Up to 50% Discount)
| Batch Type | Input per 1M | Output per 1M | Turnaround |
|---|---|---|---|
| 1-Hour Batch | $0.02 | $0.04 | Results within 1 hour |
| 24-Hour Batch | $0.015 | $0.03 | Results within 24 hours |
Other Token Factory Services
| Service | Price |
|---|---|
| Embeddings | $0.02 per 1M tokens |
| LoRA Training | $3.00 per 1K training tokens |
Cost Example
Processing 1,000 real-time chat requests averaging 500 input tokens and 200 output tokens each:
- Input: 500K tokens × $0.03/1M = $0.015
- Output: 200K tokens × $0.06/1M = $0.012
- Total: $0.027 (less than 3 cents for 1,000 requests)
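The same arithmetic can be sketched in Python. This is a minimal illustration rather than an official Packet.ai tool: the function name `token_cost` and its parameters are hypothetical, and the rates are the real-time prices listed above.

```python
# Real-time Token Factory rates from the pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.03
OUTPUT_PRICE_PER_M = 0.06


def token_cost(requests, avg_input_tokens, avg_output_tokens,
               input_price=INPUT_PRICE_PER_M, output_price=OUTPUT_PRICE_PER_M):
    """Estimate the total cost in USD for a set of similar requests."""
    input_tokens = requests * avg_input_tokens
    output_tokens = requests * avg_output_tokens
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 1,000 requests, 500 input and 200 output tokens each.
print(f"${token_cost(1000, 500, 200):.3f}")  # prints $0.027
```

Passing different `input_price` and `output_price` values lets you reuse the helper for the batch rates.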
Storage Pricing
| Storage Type | Price | Notes |
|---|---|---|
| Ephemeral (Local NVMe) | Free | Included with GPU, cleared on restart |
| Persistent (NFS) | $0.10/GB/month | Survives restarts, ~$0.00014/GB/hour |
Example: 100GB persistent storage costs approximately $10/month or $0.33/day.
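A short sketch of how the monthly rate converts to hourly and daily figures. The helper name `storage_cost` is hypothetical, and the 30-day month is an assumption used only to derive the hourly rate.

```python
PRICE_PER_GB_MONTH = 0.10   # persistent (NFS) rate from the table above
HOURS_PER_MONTH = 30 * 24   # assumption: a 30-day billing month


def storage_cost(size_gb, hours):
    """Estimate the cost in USD of keeping a persistent volume for `hours`."""
    hourly_rate = PRICE_PER_GB_MONTH / HOURS_PER_MONTH
    return size_gb * hourly_rate * hours


print(f"${storage_cost(100, 24):.2f} per day")     # 100 GB for one day
print(f"${storage_cost(100, 720):.2f} per month")  # 100 GB for 30 days
```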
Payment Methods
Prepaid Balance
Add funds to your account:
- Go to Billing in the sidebar
- Click Add Funds
- Choose an amount:
| Amount | Bonus | Total Credits |
|---|---|---|
| $25 | - | $25 |
| $50 | - | $50 |
| $100 | 5% bonus | $105 |
| $250 | 10% bonus | $275 |
| Custom | Varies | Contact us for volume discounts |
Funds are available immediately after payment.
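To check how many credits a top-up yields, the bonus table can be expressed as a small lookup. This is an illustrative sketch: `total_credits` is a hypothetical helper, and it models only the preset amounts because custom amounts vary.

```python
# Bonus percentage for each preset top-up amount in the table above.
BONUS_PERCENT = {25: 0, 50: 0, 100: 5, 250: 10}


def total_credits(amount_usd):
    """Return the credits received for a preset top-up amount."""
    if amount_usd not in BONUS_PERCENT:
        raise ValueError("custom amounts vary; only preset amounts are modeled")
    return amount_usd + amount_usd * BONUS_PERCENT[amount_usd] / 100


print(total_credits(250))  # prints 275.0
```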
Voucher Codes
Redeem promotional vouchers for bonus credits:
- Go to Billing in the sidebar
- Click Redeem Voucher
- Enter your voucher code
- Bonus credits are added to your balance
Understanding Your Bill
Current Balance
Your balance in the sidebar shows your available credits. It refreshes every few minutes as you use resources.
Usage Breakdown
View detailed usage in Billing:
- GPU Hours: Time each GPU was running
- Token Usage: Input/output tokens for Token Factory
- Storage Hours: Persistent storage usage
- Cost by Resource: Breakdown by instance/service
Transaction History
The Billing page shows:
- Credits added (payments, vouchers)
- Credits used (GPU, tokens, storage)
- Current balance
- Pending charges
Cost Management Tips
1. Stop Unused Instances
Stopped instances don't incur GPU charges:
| State | GPU Billing | Storage Billing |
|---|---|---|
| Running | Active | Active |
| Stopped | Paused | Active (if persistent) |
| Terminated | Stopped | Stopped |
Use Stop when not actively training or running inference.
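The state table can be turned into a rough hourly estimate. This is a sketch under stated assumptions: `hourly_cost` is a hypothetical helper, the storage rate is derived from $0.10/GB/month assuming a 30-day month, and the GPU rate passed in is whatever the Launch GPU modal shows for your instance.

```python
STORAGE_PER_GB_HOUR = 0.10 / (30 * 24)  # $0.10/GB/month, assuming a 30-day month


def hourly_cost(state, gpu_hourly_rate, persistent_gb=0):
    """Estimate the hourly charge in USD for an instance in the given state."""
    storage = persistent_gb * STORAGE_PER_GB_HOUR
    if state == "running":
        return gpu_hourly_rate + storage
    if state == "stopped":
        return storage  # GPU billing is paused, persistent storage is still billed
    if state == "terminated":
        return 0.0
    raise ValueError(f"unknown state: {state}")


# Hypothetical instance: $2.00/hour GPU with a 100 GB persistent volume.
for state in ("running", "stopped", "terminated"):
    print(state, round(hourly_cost(state, 2.00, 100), 4))
```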
2. Right-Size Your GPU
Don't overprovision. Match GPU count to your workload:
- Start with fewer GPUs
- Scale up if needed using the Scale feature
- Use the GPU sizing guides in documentation
3. Use Ephemeral Storage When Possible
Persistent storage incurs ongoing charges. Use it only for:
- Training checkpoints you need to keep
- Datasets you'll reuse
- Model weights you're iterating on
4. Use Batch Processing
For non-real-time workloads, use Token Factory batch processing:
- Up to 50% cheaper than real-time (about 33% for 1-hour batches, 50% for 24-hour batches)
- Perfect for data processing, evaluations, bulk generation
- Results delivered within 1 or 24 hours
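To see what a given workload costs in each mode, the listed rates can be compared side by side. A minimal sketch: `compare_modes` and the mode keys are hypothetical names, and the rates are copied from the Token Factory pricing tables.

```python
# (input, output) prices in USD per 1M tokens, from the Token Factory tables.
RATES = {
    "real_time": (0.03, 0.06),
    "batch_1h": (0.02, 0.04),
    "batch_24h": (0.015, 0.03),
}


def compare_modes(input_tokens, output_tokens):
    """Return the estimated cost in USD of one workload under each mode."""
    return {
        mode: (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        for mode, (in_price, out_price) in RATES.items()
    }


# Example workload: 500K input tokens and 200K output tokens.
for mode, cost in compare_modes(500_000, 200_000).items():
    print(f"{mode}: ${cost:.4f}")
```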
5. Terminate When Done
When you're finished with a project, terminate the instance to stop all charges (including storage).
Warning
Terminating an instance deletes all data permanently. Make sure to save important files first.
Billing Cycle
How Charges Accumulate
- Start instance: Billing begins immediately
- Running: Charges accumulate per minute
- Stop instance: GPU billing pauses immediately
- Terminate: All charges stop
Proration
Charges are prorated by the minute. If you run a GPU for 30 minutes, you're charged for 30 minutes (not a full hour).
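Per-minute proration is a simple fraction of the hourly rate. In this sketch, `prorated_gpu_cost` is a hypothetical helper and the $2.00/hour rate is only an example value. How partial minutes are rounded is not specified here, so the function takes whole minutes.

```python
def prorated_gpu_cost(hourly_rate, minutes_running):
    """Charge in USD for a GPU billed per minute at the given hourly rate."""
    return hourly_rate * minutes_running / 60


# 30 minutes at an example rate of $2.00/hour.
print(f"${prorated_gpu_cost(2.00, 30):.2f}")  # prints $1.00
```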
Real-Time Deduction
Credits are deducted from your balance continuously as you use resources. The displayed balance updates every few minutes to reflect current usage.
Low Balance Warnings
Keep your balance healthy to avoid interruptions:
- $10 warning: Consider adding funds soon
- $5 warning: Add funds to avoid interruption
- $0 balance: Running instances may be paused
We'll email you when your balance is low. Add funds promptly to keep your workloads running.
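If you monitor your balance with your own scripts, the thresholds above map to a simple check. This is an illustrative sketch only: `balance_status` is a hypothetical helper, and treating each threshold as "at or below" is an assumption, since the exact trigger conditions are not stated.

```python
def balance_status(balance_usd):
    """Classify a balance against the documented warning thresholds.

    Assumption: each warning applies at or below its threshold.
    """
    if balance_usd <= 0:
        return "empty: running instances may be paused"
    if balance_usd <= 5:
        return "critical: add funds to avoid interruption"
    if balance_usd <= 10:
        return "low: consider adding funds soon"
    return "ok"


for balance in (25.00, 8.50, 3.00, 0.00):
    print(f"${balance:.2f} -> {balance_status(balance)}")
```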
FAQ
When am I charged?
Billing starts when you launch a GPU or make Token Factory API calls. Charges are deducted from your prepaid balance in real time.
What happens if I run out of credits?
Running instances may be paused. Add funds to resume immediately. Your data is preserved for a grace period.
Do stopped instances cost money?
GPU charges stop when you stop an instance. Persistent storage continues to be billed while the volume exists.
How do I see my usage history?
Go to Billing to view your usage breakdown, transaction history, and costs by resource.
Can I get a refund?
Unused prepaid credits don't expire and can be refunded within 30 days of purchase. Contact support for refund requests.
Do you offer enterprise pricing?
Yes! Contact enterprise@packet.ai for volume discounts, reserved capacity, and custom SLAs.
Need Help?
For billing questions, contact support@packet.ai
