{"id":1314,"date":"2026-06-26T15:58:15","date_gmt":"2026-06-26T15:58:15","guid":{"rendered":"https:\/\/convly.ai\/self-hosting-vs-api-calculator\/"},"modified":"2026-06-26T15:58:15","modified_gmt":"2026-06-26T15:58:15","slug":"self-hosting-vs-api-calculator","status":"publish","type":"page","link":"https:\/\/convly.ai\/pt\/self-hosting-vs-api-calculator\/","title":{"rendered":"Auto-hospedagem versus API: Calculadora de Ponto de Equil\u00edbrio de Custos para LLMs"},"content":{"rendered":"<p>Should you buy a GPU and self-host an open LLM, or just keep paying per token for an API? It comes down to volume. Enter your monthly usage and your hardware, and this calculator shows the break-even point \u2014 the moment owning the GPU becomes cheaper than the API bill.<\/p>\n<div class=\"shc\" id=\"shc\">\n  <div class=\"shc-grid\">\n    <div class=\"shc-col\">\n      <h4>Your usage<\/h4>\n      <label>Input tokens \/ month (millions)<input type=\"number\" id=\"shc-in\" value=\"50\" min=\"0\" step=\"1\"><\/label>\n      <label>Output tokens \/ month (millions)<input type=\"number\" id=\"shc-out\" value=\"10\" min=\"0\" step=\"1\"><\/label>\n      <label>API you'd otherwise pay for\n        <select id=\"shc-model\">\n          <option value=\"0\" data-in=\"10\" data-out=\"50\">Claude Fable 5 ($10\/$50)<\/option><option value=\"1\" data-in=\"1\" data-out=\"5\">Claude Haiku 4.5 ($1\/$5)<\/option><option value=\"2\" data-in=\"5\" data-out=\"25\">Claude Opus 4.8 ($5\/$25)<\/option><option value=\"3\" data-in=\"3\" data-out=\"15\">Claude Sonnet 4.6 ($3\/$15)<\/option><option value=\"4\" data-in=\"0.5\" data-out=\"2.15\">DeepSeek R1 ($0.5\/$2.15)<\/option><option value=\"5\" data-in=\"0.8\" data-out=\"0.8\">DeepSeek R1 Distill Llama 70B ($0.8\/$0.8)<\/option><option value=\"6\" data-in=\"0.14\" data-out=\"0.28\">DeepSeek V4-Flash ($0.14\/$0.28)<\/option><option value=\"7\" data-in=\"0.435\" data-out=\"0.87\">DeepSeek V4-Pro ($0.435\/$0.87)<\/option><option value=\"8\" data-in=\"2\" 
data-out=\"12\">Gemini 3.1 Pro ($2\/$12)<\/option><option value=\"9\" data-in=\"1.5\" data-out=\"9\">Gemini 3.5 Flash ($1.5\/$9)<\/option><option value=\"10\" data-in=\"0.05\" data-out=\"0.15\">Gemma 3 12B ($0.05\/$0.15)<\/option><option value=\"11\" data-in=\"0.08\" data-out=\"0.16\">Gemma 3 27B ($0.08\/$0.16)<\/option><option value=\"12\" data-in=\"0.05\" data-out=\"0.1\">Gemma 3 4B ($0.05\/$0.1)<\/option><option value=\"13\" data-in=\"1.4\" data-out=\"4.4\">GLM 5.2 ($1.4\/$4.4)<\/option><option value=\"14\" data-in=\"5\" data-out=\"30\">GPT-5.5 ($5\/$30)<\/option><option value=\"15\" data-in=\"0.6\" data-out=\"2.5\">Kimi K2.7 Code ($0.6\/$2.5)<\/option><option value=\"16\" data-in=\"0.02\" data-out=\"0.03\">Llama 3.1 8B ($0.02\/$0.03)<\/option><option value=\"17\" data-in=\"0.1\" data-out=\"0.32\">Llama 3.3 70B ($0.1\/$0.32)<\/option><option value=\"18\" data-in=\"0.15\" data-out=\"0.6\">Llama 4 Maverick ($0.15\/$0.6)<\/option><option value=\"19\" data-in=\"0.1\" data-out=\"0.3\">Llama 4 Scout ($0.1\/$0.3)<\/option><option value=\"20\" data-in=\"0.02\" data-out=\"0.03\">Mistral 7B ($0.02\/$0.03)<\/option><option value=\"21\" data-in=\"2\" data-out=\"6\">Mistral Large 3 ($2\/$6)<\/option><option value=\"22\" data-in=\"0.02\" data-out=\"0.04\">Mistral NeMo 12B ($0.02\/$0.04)<\/option><option value=\"23\" data-in=\"0.07\" data-out=\"0.14\">Phi-4 ($0.07\/$0.14)<\/option><option value=\"24\" data-in=\"0.12\" data-out=\"0.24\">Qwen3 14B ($0.12\/$0.24)<\/option><option value=\"25\" data-in=\"0.45\" data-out=\"1.8\">Qwen3 235B-A22B ($0.45\/$1.8)<\/option><option value=\"26\" data-in=\"0.12\" data-out=\"0.5\">Qwen3 30B-A3B ($0.12\/$0.5)<\/option><option value=\"27\" data-in=\"0.08\" data-out=\"0.28\">Qwen3 32B ($0.08\/$0.28)<\/option><option value=\"28\" data-in=\"0.04\" data-out=\"0.14\">Qwen3 8B ($0.04\/$0.14)<\/option>          <option value=\"custom\">Custom blended ($\/1M)\u2026<\/option>\n        <\/select>\n      <\/label>\n      <label id=\"shc-blended-wrap\" 
style=\"display:none\">Custom blended price ($\/1M)<input type=\"number\" id=\"shc-blended\" value=\"0.50\" min=\"0\" step=\"0.01\"><\/label>\n    <\/div>\n    <div class=\"shc-col\">\n      <h4>Your self-host rig<\/h4>\n      <label>GPU\n        <select id=\"shc-gpu\">\n          <option data-p=\"450\" data-w=\"165\">RTX 4060 Ti 16GB \u2014 $450<\/option>\n          <option data-p=\"1800\" data-w=\"450\" selected>RTX 4090 24GB \u2014 $1,800<\/option>\n          <option data-p=\"2200\" data-w=\"575\">RTX 5090 32GB \u2014 $2,200<\/option>\n          <option data-p=\"6800\" data-w=\"300\">RTX 6000 Ada 48GB \u2014 $6,800<\/option>\n          <option data-p=\"18000\" data-w=\"700\">H100 80GB \u2014 $18,000<\/option>\n          <option value=\"custom\">Custom\u2026<\/option>\n        <\/select>\n      <\/label>\n      <label id=\"shc-gpu-custom\" style=\"display:none\">GPU price ($) \/ power (W)\n        <span style=\"display:flex;gap:8px\"><input type=\"number\" id=\"shc-gpu-price\" value=\"1800\" min=\"0\"><input type=\"number\" id=\"shc-gpu-watts\" value=\"450\" min=\"0\"><\/span>\n      <\/label>\n      <label>Amortize GPU over (months)<input type=\"number\" id=\"shc-amort\" value=\"24\" min=\"1\" step=\"1\"><\/label>\n      <label>Electricity ($\/kWh)<input type=\"number\" id=\"shc-kwh\" value=\"0.15\" min=\"0\" step=\"0.01\"><\/label>\n      <label>Hours\/day GPU active<input type=\"number\" id=\"shc-hours\" value=\"8\" min=\"0\" max=\"24\" step=\"1\"><\/label>\n    <\/div>\n  <\/div>\n\n  <div id=\"shc-verdict\" class=\"shc-verdict\"><\/div>\n  <table class=\"shc-table\">\n    <tbody>\n      <tr><td>API cost (your volume)<\/td><td id=\"shc-api\">\u2014<\/td><\/tr>\n      <tr><td>Self-host cost (GPU amortized)<\/td><td id=\"shc-gpuc\">\u2014<\/td><\/tr>\n      <tr><td>Self-host cost (electricity)<\/td><td id=\"shc-elec\">\u2014<\/td><\/tr>\n      <tr class=\"shc-tot\"><td>Self-host total \/ month<\/td><td id=\"shc-self\">\u2014<\/td><\/tr>\n    <\/tbody>\n  
<\/table>\n  <p class=\"shc-note\">Self-hosting runs <strong>open-weight models<\/strong> (free weights), so this compares the per-token API bill against owning hardware. It assumes your GPU can keep up with the volume (a single GPU has a tokens\/sec ceiling) and ignores your setup\/maintenance time. Check what a GPU can actually run in our <a href=\"\/llm-vram-calculator\/\">VRAM calculator<\/a>, and current API prices in the <a href=\"\/ai-api-cost-calculator\/\">cost calculator<\/a>.<\/p>\n<\/div>\n<script>(function(){\n  var $=function(id){return document.getElementById(id);};\n  function val(id){return parseFloat($(id).value)||0;}\n  function gpu(){var s=$('shc-gpu');if(s.value==='custom')return[val('shc-gpu-price'),val('shc-gpu-watts')];var o=s.options[s.selectedIndex];return[parseFloat(o.dataset.p),parseFloat(o.dataset.w)];}\n  function api(){var s=$('shc-model');if(s.value==='custom'){var b=val('shc-blended');return[b,b];}var o=s.options[s.selectedIndex];return[parseFloat(o.dataset.in),parseFloat(o.dataset.out)];}\n  function money(x){return '$'+x.toLocaleString(undefined,{maximumFractionDigits:2});}\n  function render(){\n    var inM=val('shc-in'),outM=val('shc-out'),p=api(),g=gpu();\n    var apiM=inM*p[0]+outM*p[1];\n    var gpuM=g[0]\/Math.max(1,val('shc-amort'));\n    var elecM=(g[1]\/1000)*val('shc-hours')*30*val('shc-kwh');\n    var selfM=gpuM+elecM;\n    $('shc-api').textContent=money(apiM)+'\/mo';\n    $('shc-gpuc').textContent=money(gpuM)+'\/mo';\n    $('shc-elec').textContent=money(elecM)+'\/mo';\n    $('shc-self').textContent=money(selfM)+'\/mo';\n    var v=$('shc-verdict'),save=apiM-selfM;\n    var be=apiM>0?(selfM\/apiM)*(inM+outM):0;\n    if(save>0){v.className='shc-verdict shc-win';v.innerHTML='\u2705 <b>Self-hosting saves you '+money(save)+'\/month<\/b> at this volume.<br><span>Break-even: self-hosting wins above ~<b>'+be.toFixed(1)+'M tokens\/month<\/b> (at your input:output mix).<\/span>';}\n    else{v.className='shc-verdict 
shc-lose';v.innerHTML='\ud83d\udcb8 <b>The API is cheaper by '+money(-save)+'\/month<\/b> at this volume.<br><span>'+(be>0?'You\\'d need ~<b>'+be.toFixed(1)+'M tokens\/month<\/b> before buying this GPU pays off.':'No break-even can be estimated while the API cost is $0.')+'<\/span>';}\n  }\n  $('shc-model').addEventListener('change',function(){$('shc-blended-wrap').style.display=this.value==='custom'?'':'none';render();});\n  $('shc-gpu').addEventListener('change',function(){$('shc-gpu-custom').style.display=this.value==='custom'?'':'none';render();});\n  ['shc-in','shc-out','shc-blended','shc-gpu-price','shc-gpu-watts','shc-amort','shc-kwh','shc-hours'].forEach(function(id){$(id).addEventListener('input',render);});\n  render();\n})();<\/script>\n\n<p>Remember: self-hosting runs <a href=\"\/models\/\">open-weight models<\/a>, so factor in the quality difference versus a frontier API \u2014 and use our <a href=\"\/llm-vram-calculator\/\">VRAM calculator<\/a> to confirm your GPU can actually run the model you want.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Should you buy a GPU and self-host an open LLM, or just keep paying per token for an API? 
It [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"class_list":["post-1314","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/pages\/1314","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/types\/page"}],"replies":[{"embeddable":true,"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/comments?post=1314"}],"version-history":[{"count":0,"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/pages\/1314\/revisions"}],"wp:attachment":[{"href":"https:\/\/convly.ai\/pt\/wp-json\/wp\/v2\/media?parent=1314"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}