Billing & Usage Guide

Understand how billing works and manage your costs effectively.

Billing Model

Packet.ai uses pay-as-you-go billing with prepaid credits:

Service	Billing Unit	Notes
GPU Compute	Per hour	Billed while running, prorated by minute
Token Factory	Per 1M tokens	Input and output tokens billed separately
Persistent Storage	Per GB/hour	Billed continuously while volume exists
Network	Free	No bandwidth charges

GPU Pricing

GPU pricing varies by type, region, and availability. Approximate rates:

GPU Type	VRAM	Price/Hour	Best For
RTX 4090	24GB	$0.50 - $0.80	Inference, 7B models, training
A100 40GB	40GB	$1.50 - $2.00	Large models, training, fine-tuning
A100 80GB	80GB	$2.00 - $3.00	70B models, large batch training
H100	80GB	$3.00 - $4.00	Maximum performance, production

Prices shown are approximate and vary by region. See current pricing in the Launch GPU modal.

Token Factory Pricing

Token Factory offers pay-per-token pricing with discounts for batch processing:

Real-Time Inference

Token Type	Price per 1M Tokens
Input Tokens	$0.03
Output Tokens	$0.06

Batch Processing (Up to 50% Discount)

Batch Type	Input per 1M	Output per 1M	Turnaround
1-Hour Batch	$0.02	$0.04	Results within 1 hour
24-Hour Batch	$0.015	$0.03	Results within 24 hours

Other Token Factory Services

Service	Price
Embeddings	$0.02 per 1M tokens
LoRA Training	$3.00 per 1K training tokens

Cost Example

Processing 1,000 chat requests at real-time rates, averaging 500 input tokens and 200 output tokens per request:

Input: 500K tokens × $0.03/1M = $0.015
Output: 200K tokens × $0.06/1M = $0.012
Total: $0.027 (less than 3 cents for 1,000 requests)
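
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than an official calculator: the helper name `estimate_token_cost` is hypothetical, and the rates are the real-time prices listed in this guide.

```python
from decimal import Decimal

# Real-time rates from the pricing table, in USD per 1M tokens.
INPUT_RATE_PER_M = Decimal("0.03")
OUTPUT_RATE_PER_M = Decimal("0.06")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_token_cost(requests: int, avg_input_tokens: int, avg_output_tokens: int) -> Decimal:
    """Estimate real-time inference cost in USD (hypothetical helper)."""
    input_tokens = Decimal(requests * avg_input_tokens)
    output_tokens = Decimal(requests * avg_output_tokens)
    # Input and output tokens are billed separately, so price each side on its own.
    input_cost = input_tokens / TOKENS_PER_M * INPUT_RATE_PER_M
    output_cost = output_tokens / TOKENS_PER_M * OUTPUT_RATE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # 1,000 requests, 500 input and 200 output tokens each.
    print(estimate_token_cost(1_000, 500, 200))  # prints 0.027
```

Decimal arithmetic is used here so that small per-token amounts do not pick up floating-point rounding noise.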

Storage Pricing

Storage Type	Price	Notes
Ephemeral (Local NVMe)	Free	Included with GPU, cleared on restart
Persistent (NFS)	$0.10/GB/month	Survives restarts, ~$0.00014/GB/hour

Example: 100GB of persistent storage costs approximately $10/month, or about $0.33/day (assuming a 30-day month).
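
To estimate storage cost for an arbitrary volume size and duration, the monthly rate can be converted to an hourly one. The sketch below assumes a 30-day month (720 hours), which matches the ~$0.00014/GB/hour figure above; the function name `storage_cost` is hypothetical.

```python
MONTHLY_RATE_PER_GB = 0.10  # USD per GB per month, from the storage table
HOURS_PER_MONTH = 30 * 24   # assumption: a 30-day month


def storage_cost(gb: float, hours: float) -> float:
    """Approximate persistent storage cost in USD (hypothetical helper)."""
    hourly_rate_per_gb = MONTHLY_RATE_PER_GB / HOURS_PER_MONTH
    return gb * hours * hourly_rate_per_gb


if __name__ == "__main__":
    print(round(storage_cost(100, 24), 2))               # one day of 100GB
    print(round(storage_cost(100, HOURS_PER_MONTH), 2))  # one month of 100GB
```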

Payment Methods

Prepaid Balance

Add funds to your account:

Go to Billing in the sidebar
Click Add Funds
Choose an amount:

Amount	Bonus	Total Credits
$25	-	$25
$50	-	$50
$100	5% bonus	$105
$250	10% bonus	$275
Custom	Varies	Contact us for volume discounts

Funds are available immediately after payment.
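
As a quick check on the bonus tiers, the table can be expressed as a lookup. This is only a sketch of the preset amounts listed above: the name `total_credits` is hypothetical, and custom amounts are deliberately not modeled because their bonus varies.

```python
from decimal import Decimal

# Bonus percentage for each preset top-up amount (USD) in the table.
BONUS_PERCENT = {25: 0, 50: 0, 100: 5, 250: 10}


def total_credits(amount_usd: int) -> Decimal:
    """Return credits received for a preset top-up amount (hypothetical helper)."""
    if amount_usd not in BONUS_PERCENT:
        # Custom amounts have a variable bonus, so this sketch does not guess.
        raise ValueError("not a preset amount; custom bonuses vary")
    bonus = Decimal(amount_usd) * BONUS_PERCENT[amount_usd] / 100
    return Decimal(amount_usd) + bonus


if __name__ == "__main__":
    for amount in (25, 50, 100, 250):
        print(amount, total_credits(amount))
```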

Voucher Codes

Redeem promotional vouchers for bonus credits:

Go to Billing in the sidebar
Click Redeem Voucher
Enter your voucher code
Bonus credits are added to your balance

Understanding Your Bill

Current Balance

The balance in the sidebar shows your available credits. Credits are deducted in real time as you use resources, and the displayed figure refreshes every few minutes.

Usage Breakdown

View detailed usage in Billing:

GPU Hours: Time each GPU was running
Token Usage: Input/output tokens for Token Factory
Storage Hours: Persistent storage usage
Cost by Resource: Breakdown by instance/service

Transaction History

The Billing page shows:

Credits added (payments, vouchers)
Credits used (GPU, tokens, storage)
Current balance
Pending charges

Cost Management Tips

1. Stop Unused Instances

Stopped instances don't incur GPU charges:

State	GPU Billing	Storage Billing
Running	Active	Active
Stopped	Paused	Active (if persistent)
Terminated	Stopped	Stopped

Use Stop when not actively training or running inference.
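
The state table can be read as a small rule set. The sketch below is illustrative only: `InstanceState` and `billed_components` are hypothetical names, and it assumes storage charges apply only when a persistent volume is attached, since ephemeral storage is listed as free.

```python
from enum import Enum


class InstanceState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    TERMINATED = "terminated"


def billed_components(state: InstanceState, has_persistent_volume: bool) -> set:
    """Return which charges accrue for an instance in the given state."""
    billed = set()
    # GPU time is billed only while the instance is running.
    if state is InstanceState.RUNNING:
        billed.add("gpu")
    # Assumption: storage is billed only for persistent volumes, and only
    # until the instance is terminated.
    if state is not InstanceState.TERMINATED and has_persistent_volume:
        billed.add("storage")
    return billed


if __name__ == "__main__":
    for state in InstanceState:
        print(state.value, sorted(billed_components(state, has_persistent_volume=True)))
```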

2. Right-Size Your GPU

Don't overprovision. Match GPU count to your workload:

Start with fewer GPUs
Scale up if needed using the Scale feature
Use the GPU sizing guides in documentation

3. Use Ephemeral Storage When Possible

Persistent storage incurs ongoing charges. Use it only for:

Training checkpoints you need to keep
Datasets you'll reuse
Model weights you're iterating on

4. Use Batch Processing

For non-real-time workloads, use Token Factory batch processing:

Up to 50% cheaper than real-time (about 33% for 1-hour batches, 50% for 24-hour batches)
Perfect for data processing, evaluations, bulk generation
Results delivered within 1 or 24 hours
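
To see what batch processing saves on a specific workload, the three price tiers from the Token Factory tables can be compared side by side. This is a rough sketch: the `RATES` dictionary and `workload_cost` helper are hypothetical names, and the token counts in the usage example are made up for illustration.

```python
from decimal import Decimal

# (input, output) rates in USD per 1M tokens, from the Token Factory tables.
RATES = {
    "real-time": (Decimal("0.03"), Decimal("0.06")),
    "1-hour batch": (Decimal("0.02"), Decimal("0.04")),
    "24-hour batch": (Decimal("0.015"), Decimal("0.03")),
}


def workload_cost(mode: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of a workload under the given pricing mode (hypothetical helper)."""
    input_rate, output_rate = RATES[mode]
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Illustrative workload: 10M input tokens and 4M output tokens.
    baseline = workload_cost("real-time", 10_000_000, 4_000_000)
    for mode in RATES:
        cost = workload_cost(mode, 10_000_000, 4_000_000)
        saving = (1 - cost / baseline) * 100
        print(f"{mode}: ${cost} ({saving:.0f}% cheaper than real-time)")
```

At the listed rates, the 1-hour tier works out to roughly a third cheaper than real-time and the 24-hour tier to half, so the longer turnaround is worth it whenever results are not needed within the hour.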

5. Terminate When Done

When you're finished with a project, terminate the instance to stop all charges (including storage).

Warning

Terminating an instance deletes all data permanently. Make sure to save important files first.

Billing Cycle

How Charges Accumulate

Start instance: Billing begins immediately
Running: Charges accumulate per minute
Stop instance: GPU billing pauses immediately
Terminate: All charges stop

Proration

Charges are prorated by the minute. If you run a GPU for 30 minutes, you're charged for 30 minutes (not a full hour).
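
Per-minute proration amounts to scaling the hourly rate by minutes used. The snippet below is a minimal sketch with a hypothetical `gpu_charge` helper and an example rate of $2.00/hour; how partial minutes are rounded is not specified in this guide, so the function takes whole minutes.

```python
def gpu_charge(hourly_rate: float, minutes_running: int) -> float:
    """Prorated GPU charge in USD for a whole number of minutes (hypothetical helper)."""
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    # Each minute costs 1/60 of the hourly rate.
    return hourly_rate * minutes_running / 60


if __name__ == "__main__":
    # 30 minutes at an example rate of $2.00/hour costs half the hourly price.
    print(gpu_charge(2.00, 30))  # prints 1.0
```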

Real-Time Deduction

Credits are deducted from your balance in real time. The displayed balance refreshes every few minutes to reflect current usage.

Low Balance Warnings

Keep your balance healthy to avoid interruptions:

$10 warning: Consider adding funds soon
$5 warning: Add funds to avoid interruption
$0 balance: Running instances may be paused

We'll email you when your balance is low. Add funds promptly to keep your workloads running.
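
If you track your balance in your own tooling, the thresholds above can be mirrored locally and combined with a rough runway estimate. This is a sketch under assumptions: the function names and status labels are hypothetical, each warning is assumed to trigger at or below its threshold, and the hourly spend is whatever you compute from your own running resources.

```python
WARNING_THRESHOLD = 10.0   # USD, first warning level in the list above
CRITICAL_THRESHOLD = 5.0   # USD, second warning level


def balance_status(balance: float) -> str:
    """Classify a balance against the warning levels (labels are hypothetical)."""
    if balance <= 0:
        return "empty"      # running instances may be paused
    if balance <= CRITICAL_THRESHOLD:
        return "critical"   # add funds to avoid interruption
    if balance <= WARNING_THRESHOLD:
        return "warning"    # consider adding funds soon
    return "ok"


def hours_until_empty(balance: float, hourly_spend: float) -> float:
    """Estimate hours of runway at a constant hourly spend."""
    if hourly_spend <= 0:
        raise ValueError("hourly_spend must be positive")
    return max(balance, 0.0) / hourly_spend


if __name__ == "__main__":
    print(balance_status(8.0), hours_until_empty(8.0, 2.0))
```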

FAQ

When am I charged?

Billing starts when you launch a GPU or make Token Factory API calls. Charges are deducted from your prepaid balance in real time.

What happens if I run out of credits?

Running instances may be paused. Add funds to resume immediately. Your data is preserved for a grace period.

Do stopped instances cost money?

GPU charges stop when you stop an instance. Persistent storage continues to be billed while the volume exists.

How do I see my usage history?

Go to Billing to view your usage breakdown, transaction history, and costs by resource.

Can I get a refund?

Unused prepaid credits don't expire and can be refunded within 30 days of purchase. To request a refund, contact support.

Do you offer enterprise pricing?

Yes! Contact enterprise@packet.ai for volume discounts, reserved capacity, and custom SLAs.

Need Help?

For billing questions, contact support@packet.ai