Billing & Usage Guide

Understand how billing works and manage your costs effectively.

Billing Model

Packet.ai uses pay-as-you-go billing with prepaid credits:

Service | Billing Unit | Notes
GPU Compute | Per hour | Billed while running, prorated by minute
Token Factory | Per 1K tokens | Input and output tokens billed separately
Persistent Storage | Per GB/hour | Billed continuously while volume exists
Network | Free | No bandwidth charges

GPU Pricing

GPU pricing varies by type, region, and availability. Approximate hourly rates:

GPU Type | VRAM | Price/Hour | Best For
RTX 4090 | 24GB | $0.50 - $0.80 | Inference, 7B models, training
A100 40GB | 40GB | $1.50 - $2.00 | Large models, training, fine-tuning
A100 80GB | 80GB | $2.00 - $3.00 | 70B models, large batch training
H100 | 80GB | $3.00 - $4.00 | Maximum performance, production

Prices shown are approximate and vary by region. See current pricing in the Launch GPU modal.
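
As a rough planning aid, the price ranges above can be turned into a low/high cost estimate for a run. This is a minimal sketch, not an official calculator: the function name and the hard-coded ranges are illustrative assumptions copied from the table, and actual rates depend on region and availability.

```python
# Approximate hourly price ranges in USD, copied from the table above.
# Actual rates vary by region; check the Launch GPU modal for current prices.
GPU_PRICE_RANGES = {
    "RTX 4090": (0.50, 0.80),
    "A100 40GB": (1.50, 2.00),
    "A100 80GB": (2.00, 3.00),
    "H100": (3.00, 4.00),
}


def estimate_gpu_cost_range(gpu_type, hours, gpu_count=1):
    """Return a (low, high) cost estimate in USD for a run (hypothetical helper)."""
    low_rate, high_rate = GPU_PRICE_RANGES[gpu_type]
    low = round(low_rate * hours * gpu_count, 2)
    high = round(high_rate * hours * gpu_count, 2)
    return low, high


low, high = estimate_gpu_cost_range("A100 80GB", hours=8)
print(f"8 hours on one A100 80GB: ${low:.2f} to ${high:.2f}")
```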

Token Factory Pricing

Token Factory offers pay-per-token pricing with discounts for batch processing:

Real-Time Inference

Token Type | Price per 1M Tokens
Input Tokens | $0.03
Output Tokens | $0.06

Batch Processing (Up to 50% Discount)

Batch Type | Input per 1M | Output per 1M | Turnaround
1-Hour Batch | $0.02 | $0.04 | Results within 1 hour
24-Hour Batch | $0.015 | $0.03 | Results within 24 hours

Other Token Factory Services

Service | Price
Embeddings | $0.02 per 1M tokens
LoRA Training | $3.00 per 1K training tokens

Cost Example

Processing 1,000 chat requests with average 500 input tokens and 200 output tokens:

  • Input: 500K tokens × $0.03/1M = $0.015
  • Output: 200K tokens × $0.06/1M = $0.012
  • Total: $0.027 (less than 3 cents for 1,000 requests)
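
The same arithmetic can be scripted for other request volumes. The sketch below assumes the real-time rates listed above and uses Python's `decimal` module so that small per-token amounts are not distorted by floating-point rounding. The function name is illustrative and not part of any Packet.ai SDK.

```python
from decimal import Decimal

# Real-time rates in USD per 1M tokens, from the Real-Time Inference table.
INPUT_RATE_PER_1M = Decimal("0.03")
OUTPUT_RATE_PER_1M = Decimal("0.06")
ONE_MILLION = Decimal(1_000_000)


def realtime_cost(requests, avg_input_tokens, avg_output_tokens):
    """Estimate the real-time inference cost in USD (hypothetical helper)."""
    input_tokens = requests * avg_input_tokens
    output_tokens = requests * avg_output_tokens
    input_cost = input_tokens * INPUT_RATE_PER_1M / ONE_MILLION
    output_cost = output_tokens * OUTPUT_RATE_PER_1M / ONE_MILLION
    return input_cost + output_cost


# The example above: 1,000 requests, 500 input and 200 output tokens each.
print(f"Estimated cost: ${realtime_cost(1000, 500, 200)}")
```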

Storage Pricing

Storage Type | Price | Notes
Ephemeral (Local NVMe) | Free | Included with GPU, cleared on restart
Persistent (NFS) | $0.10/GB/month | Survives restarts, ~$0.00014/GB/hour

Example: 100GB persistent storage costs approximately $10/month or $0.33/day.
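
To estimate storage cost for other sizes and durations, the monthly rate can be converted to an hourly one. This sketch assumes a 30-day month (720 hours), which matches the approximate hourly figure in the table. The exact conversion used for billing is not stated here, so treat the result as an estimate.

```python
PERSISTENT_RATE_PER_GB_MONTH = 0.10  # USD, from the table above
HOURS_PER_MONTH = 30 * 24            # assumption: 30-day month


def storage_cost(size_gb, hours):
    """Estimate persistent storage cost in USD (hypothetical helper)."""
    hourly_rate = PERSISTENT_RATE_PER_GB_MONTH / HOURS_PER_MONTH
    return size_gb * hourly_rate * hours


print(f"100 GB for 30 days: ${storage_cost(100, 30 * 24):.2f}")
print(f"100 GB for 1 day:   ${storage_cost(100, 24):.2f}")
```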

Payment Methods

Prepaid Balance

Add funds to your account:

  1. Go to Billing in the sidebar
  2. Click Add Funds
  3. Choose an amount:
     Amount | Bonus | Total Credits
     $25 | - | $25
     $50 | - | $50
     $100 | 5% bonus | $105
     $250 | 10% bonus | $275
     Custom | Varies | Contact us for volume discounts

Funds are available immediately after payment.
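
The bonus tiers above can be expressed as a small lookup. This is an illustrative sketch only: it covers the four preset amounts, and because the bonus for custom amounts is listed as "Varies", the helper deliberately refuses to guess for anything else.

```python
from decimal import Decimal

# Bonus rate for each preset amount in USD, from the table above.
PRESET_BONUS = {
    Decimal("25"): Decimal("0"),
    Decimal("50"): Decimal("0"),
    Decimal("100"): Decimal("0.05"),
    Decimal("250"): Decimal("0.10"),
}


def total_credits(amount):
    """Return the credits received for a preset top-up (hypothetical helper)."""
    amount = Decimal(str(amount))
    if amount not in PRESET_BONUS:
        # Custom amounts have no documented bonus rate.
        raise ValueError("Bonus for custom amounts is not documented")
    return amount * (1 + PRESET_BONUS[amount])


print(f"$100 top-up gives ${total_credits(100)} in credits")
```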

Voucher Codes

Redeem promotional vouchers for bonus credits:

  1. Go to Billing in the sidebar
  2. Click Redeem Voucher
  3. Enter your voucher code
  4. Bonus credits are added to your balance

Understanding Your Bill

Current Balance

The balance in the sidebar shows your available credits and updates as you use resources.

Usage Breakdown

View detailed usage in Billing:

  • GPU Hours: Time each GPU was running
  • Token Usage: Input/output tokens for Token Factory
  • Storage Hours: Persistent storage usage
  • Cost by Resource: Breakdown by instance/service

Transaction History

The Billing page shows:

  • Credits added (payments, vouchers)
  • Credits used (GPU, tokens, storage)
  • Current balance
  • Pending charges

Cost Management Tips

1. Stop Unused Instances

Stopped instances don't incur GPU charges:

State | GPU Billing | Storage Billing
Running | Active | Active
Stopped | Paused | Active (if persistent)
Terminated | Stopped | Stopped

Use Stop when not actively training or running inference.
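
To see how much stopping an idle instance saves, the state table can be encoded directly. The sketch below is a rough model under stated assumptions: the $3.00/hour GPU rate and 100 GB volume are hypothetical inputs, and the storage rate is the $0.10/GB/month figure from Storage Pricing converted with a 30-day month.

```python
# Which charges apply in each instance state, per the table above.
CHARGES_BY_STATE = {
    "running": ("gpu", "storage"),
    "stopped": ("storage",),  # storage is billed only if the volume is persistent
    "terminated": (),
}

# Persistent storage rate, converted to hours assuming a 30-day month.
STORAGE_RATE_PER_GB_HOUR = 0.10 / (30 * 24)


def instance_cost(hours_by_state, gpu_rate_per_hour, persistent_gb=0.0):
    """Estimate cost in USD from hours spent in each state (hypothetical helper)."""
    total = 0.0
    for state, hours in hours_by_state.items():
        charges = CHARGES_BY_STATE[state]
        if "gpu" in charges:
            total += gpu_rate_per_hour * hours
        if "storage" in charges:
            total += persistent_gb * STORAGE_RATE_PER_GB_HOUR * hours
    return total


# Hypothetical day: 8 hours of training, idle for the remaining 16 hours.
always_on = instance_cost({"running": 24}, gpu_rate_per_hour=3.00, persistent_gb=100)
stop_when_idle = instance_cost(
    {"running": 8, "stopped": 16}, gpu_rate_per_hour=3.00, persistent_gb=100
)
print(f"Always on: ${always_on:.2f}/day")
print(f"Stopped when idle: ${stop_when_idle:.2f}/day")
```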

2. Right-Size Your GPU

Don't overprovision. Match GPU count to your workload:

  • Start with fewer GPUs
  • Scale up if needed using the Scale feature
  • Use the GPU sizing guides in documentation

3. Use Ephemeral Storage When Possible

Persistent storage incurs ongoing charges. Use it only for:

  • Training checkpoints you need to keep
  • Datasets you'll reuse
  • Model weights you're iterating on

4. Use Batch Processing

For non-real-time workloads, use Token Factory batch processing:

  • Up to 50% cheaper than real-time (about 33% for 1-hour batches, 50% for 24-hour batches)
  • Perfect for data processing, evaluations, bulk generation
  • Results delivered within 1 or 24 hours
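
To compare tiers for a specific workload, the per-1M rates from the Token Factory Pricing tables can be applied side by side. This is a minimal sketch with illustrative names. At the listed rates, the saving relative to real-time works out to roughly a third for 1-hour batches and half for 24-hour batches.

```python
from decimal import Decimal

# (input, output) rates in USD per 1M tokens, from the Token Factory tables.
RATES_PER_1M = {
    "real-time": (Decimal("0.03"), Decimal("0.06")),
    "1-hour batch": (Decimal("0.02"), Decimal("0.04")),
    "24-hour batch": (Decimal("0.015"), Decimal("0.03")),
}
ONE_MILLION = Decimal(1_000_000)


def tier_cost(tier, input_tokens, output_tokens):
    """Cost in USD of a workload on a given tier (hypothetical helper)."""
    input_rate, output_rate = RATES_PER_1M[tier]
    return (input_tokens * input_rate + output_tokens * output_rate) / ONE_MILLION


def savings_vs_realtime(tier, input_tokens, output_tokens):
    """Fraction saved compared with real-time inference (0.5 means 50%)."""
    realtime = tier_cost("real-time", input_tokens, output_tokens)
    return 1 - tier_cost(tier, input_tokens, output_tokens) / realtime


for tier in RATES_PER_1M:
    cost = tier_cost(tier, 500_000, 200_000)
    saved = savings_vs_realtime(tier, 500_000, 200_000)
    print(f"{tier}: ${cost} ({round(saved * 100)}% cheaper than real-time)")
```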

5. Terminate When Done

When you're finished with a project, terminate the instance to stop all charges (including storage).

Warning

Terminating an instance deletes all data permanently. Make sure to save important files first.

Billing Cycle

How Charges Accumulate

  1. Start instance: Billing begins immediately
  2. Running: Charges accumulate per minute
  3. Stop instance: GPU billing pauses immediately
  4. Terminate: All charges stop

Proration

Charges are prorated by the minute. If you run a GPU for 30 minutes, you're charged for 30 minutes (not a full hour).
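
Per-minute proration can be sketched as follows. How partial minutes are handled is not specified above, so rounding up to the next whole minute is an assumption made purely for illustration, and the function name is hypothetical.

```python
import math


def prorated_gpu_cost(hourly_rate, seconds_running):
    """Estimate a prorated GPU charge in USD (hypothetical helper).

    Assumption: partial minutes are rounded up to the next whole minute.
    """
    billed_minutes = math.ceil(seconds_running / 60)
    return hourly_rate * billed_minutes / 60


# 30 minutes on a $2.00/hour GPU is billed as half an hour.
print(f"${prorated_gpu_cost(2.00, 30 * 60):.2f}")
```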

Real-Time Deduction

Credits are deducted from your balance continuously as you use resources. The displayed balance refreshes every few minutes to reflect current usage.

Low Balance Warnings

Keep your balance healthy to avoid interruptions:

  • $10 warning: Consider adding funds soon
  • $5 warning: Add funds to avoid interruption
  • $0 balance: Running instances may be paused

We'll email you when your balance is low. Add funds promptly to keep your workloads running.
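
If you monitor your balance from your own scripts, the thresholds above can be mirrored locally, along with a rough estimate of how long the balance will last. This is a sketch under assumptions: whether each threshold is inclusive is not documented, and the helper names are illustrative rather than part of any Packet.ai API.

```python
def balance_status(balance):
    """Map a balance in USD to the warning levels listed above.

    Assumption: each threshold is treated as inclusive.
    """
    if balance <= 0:
        return "running instances may be paused"
    if balance <= 5:
        return "add funds to avoid interruption"
    if balance <= 10:
        return "consider adding funds soon"
    return "ok"


def hours_of_runway(balance, hourly_spend):
    """Estimate the hours until the balance reaches zero at a steady spend rate."""
    if hourly_spend <= 0:
        return float("inf")
    return max(balance, 0.0) / hourly_spend


# Hypothetical account: $12.00 left while spending $3.00/hour.
print(balance_status(12.00), "-", hours_of_runway(12.00, 3.00), "hours left")
```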

FAQ

When am I charged?

Billing starts when you launch a GPU or make Token Factory API calls. Charges are deducted from your prepaid balance as you use resources.

What happens if I run out of credits?

Running instances may be paused. Add funds to resume immediately. Your data is preserved for a grace period.

Do stopped instances cost money?

GPU charges stop when you stop an instance. Persistent storage continues to be billed while the volume exists.

How do I see my usage history?

Go to Billing to view your usage breakdown, transaction history, and costs by resource.

Can I get a refund?

Unused prepaid credits don't expire and can be refunded within 30 days of purchase. Contact support for refund requests.

Do you offer enterprise pricing?

Yes! Contact enterprise@packet.ai for volume discounts, reserved capacity, and custom SLAs.

Need Help?

For billing questions, contact support@packet.ai