{"id":792,"date":"2026-06-06T01:59:16","date_gmt":"2026-06-06T01:59:16","guid":{"rendered":"https:\/\/convly.ai\/what-is-ollama-complete-guide-2026\/"},"modified":"2026-06-19T16:39:52","modified_gmt":"2026-06-19T16:39:52","slug":"what-is-ollama-complete-guide-2026","status":"publish","type":"post","link":"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/","title":{"rendered":"What Is Ollama? The Complete Guide to Running LLMs Locally in 2026"},"content":{"rendered":"<p>If you&#8217;ve spent any time around local AI in the last two years, you&#8217;ve heard the name. Ollama is the tool that turned &#8220;run a large language model on your own machine&#8221; from a weekend of CUDA errors into a single command: <code>ollama run llama3.3<\/code>.<\/p>\n<p>This guide explains exactly what Ollama is, how it works under the hood, what it can and can&#8217;t do, and whether it&#8217;s the right tool for you in 2026.<\/p>\n<div class=\"convly-tldr\">\n<h3>Key takeaways<\/h3>\n<ul>\n<li><strong>What it is:<\/strong> a free, open-source tool that downloads, manages, and runs open LLMs locally with one command \u2014 no cloud, no API keys, no data leaving your machine.<\/li>\n<li><strong>How it works:<\/strong> it wraps the <code>llama.cpp<\/code> engine (and Apple&#8217;s MLX on Mac since v0.19) and handles model downloads, quantization, GPU allocation, and a REST API on port <code>11434<\/code>.<\/li>\n<li><strong>Who it&#8217;s for:<\/strong> developers and tinkerers who want the lowest-friction way to prototype with local models. It&#8217;s the &#8220;lowest regret&#8221; entry point in 2026.<\/li>\n<li><strong>Who it isn&#8217;t for:<\/strong> high-concurrency production serving \u2014 for that, <a href=\"https:\/\/convly.ai\/it\/ollama-vs-lm-studio-vs-vllm-vs-llama-cpp-2026\/\">vLLM is roughly 16\u201320\u00d7 faster under load<\/a>.<\/li>\n<li><strong>Cost:<\/strong> $0. 
It&#8217;s MIT-licensed and runs entirely on your hardware.<\/li>\n<\/ul>\n<\/div>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_84 counter-flat ez-toc-counter ez-toc-container-direction\">\n<label for=\"ez-toc-cssicon-toggle-item-6a38a8d368107\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #000000;color:#000000\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #000000;color:#000000\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a38a8d368107\"  aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#What_Ollama_actually_is\" >What Ollama actually is<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#How_it_works_under_the_hood\" >How it works under the hood<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#What_you_can_build_with_it\" >What you can build with it<\/a><\/li><li 
class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#Where_Ollama_fits_among_the_alternatives\" >Where Ollama fits among the alternatives<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#Getting_started_in_two_minutes\" >Getting started in two minutes<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#What_hardware_do_you_actually_need\" >What hardware do you actually need?<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#FAQ\" >FAQ<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#Bottom_line\" >Bottom line<\/a><\/li><li class='ez-toc-page-1'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/convly.ai\/it\/what-is-ollama-complete-guide-2026\/#Related_articles\" >Related articles<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"What_Ollama_actually_is\"><\/span>What Ollama actually is<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama is an open-source runtime for large language models that runs on your own computer \u2014 Mac, Windows, or Linux. Think of it as the &#8220;Docker for LLMs&#8221;: instead of wrestling with Python environments, model weights, and GPU drivers, you type one command and a model is running.<\/p>\n<p>The pitch is simple: <strong>keep your data on your machine, pay nothing per token, and work offline.<\/strong> When you run <code>ollama run gemma4<\/code>, Ollama downloads the model, loads it into your GPU&#8217;s memory (or system RAM if you don&#8217;t have a GPU), and drops you into a chat prompt. 
That&#8217;s it.<\/p>\n<p>Behind that simplicity, Ollama is doing a lot of work for you:<\/p>\n<ul>\n<li><strong>Model management<\/strong> \u2014 pulling, versioning, and storing models from its registry, the way a package manager handles software.<\/li>\n<li><strong>Quantization<\/strong> \u2014 automatically using compressed (GGUF) versions of models so a 27-billion-parameter model fits in consumer memory.<\/li>\n<li><strong>GPU layer allocation<\/strong> \u2014 deciding how much of the model lives on your GPU versus CPU, based on the VRAM you have.<\/li>\n<li><strong>Context and KV-cache management<\/strong> \u2014 handling the memory that grows as a conversation gets longer.<\/li>\n<li><strong>A REST API<\/strong> \u2014 exposing everything on <code>http:\/\/localhost:11434<\/code> so your own apps can talk to it.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"How_it_works_under_the_hood\"><\/span>How it works under the hood<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama is not itself an inference engine. It&#8217;s an <strong>experience layer<\/strong> wrapped around one. Under the hood it uses <code>llama.cpp<\/code>, the C++ engine that does the actual math of running a quantized model efficiently on CPUs and GPUs. 
As of v0.19 (March 2026), Ollama also uses <strong>Apple&#8217;s MLX backend<\/strong> on Apple Silicon \u2014 a change that delivered enormous speedups (on an M5 Max running Qwen 3.5, decode throughput nearly doubled).<\/p>\n<p>The workflow looks like this:<\/p>\n<ol>\n<li><strong>You run a command<\/strong> \u2014 <code>ollama run qwen3<\/code> from the terminal, or a request to the API.<\/li>\n<li><strong>Ollama resolves the model<\/strong> \u2014 if it isn&#8217;t already downloaded, it pulls the GGUF weights from the registry.<\/li>\n<li><strong>It loads the model into memory<\/strong> \u2014 splitting layers between GPU and CPU based on available VRAM.<\/li>\n<li><strong>It serves responses<\/strong> \u2014 either interactively in your terminal or as JSON over the REST API.<\/li>\n<\/ol>\n<p>That REST API is the part developers care about most. Any app that can make an HTTP request can use a local model through Ollama \u2014 and because Ollama added an OpenAI-compatible endpoint, a lot of existing code works by just changing the base URL.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"What_you_can_build_with_it\"><\/span>What you can build with it<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama is the engine behind a huge range of local-AI projects in 2026:<\/p>\n<ul>\n<li><strong>Private chatbots<\/strong> that never send a word to the cloud.<\/li>\n<li><strong>Coding assistants<\/strong> \u2014 the newer <code>ollama launch<\/code> command wires up tools like Claude Code, OpenCode, and Codex to a local or cloud model with no config files.<\/li>\n<li><strong>RAG systems<\/strong> using Ollama&#8217;s batch embedding API to index your own documents.<\/li>\n<li><strong>Agents and automations<\/strong> that call local models for classification, extraction, or summarization at zero marginal cost.<\/li>\n<li><strong>Structured-output pipelines<\/strong> \u2014 Ollama can now constrain a model&#8217;s output to a JSON schema, which makes it reliable for 
programmatic use.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Where_Ollama_fits_among_the_alternatives\"><\/span>Where Ollama fits among the alternatives<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama isn&#8217;t the only way to run models locally, and it isn&#8217;t always the best. Here&#8217;s the honest landscape:<\/p>\n<table class=\"convly-vs\">\n<thead>\n<tr>\n<th>Tool<\/th>\n<th>Best for<\/th>\n<th>Trade-off<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>Ollama<\/strong><\/td>\n<td>One-developer prototyping on any OS<\/td>\n<td>Slow under heavy concurrency<\/td>\n<\/tr>\n<tr>\n<td>LM Studio<\/td>\n<td>A polished GUI to browse and chat with models<\/td>\n<td>Less scriptable; desktop-first<\/td>\n<\/tr>\n<tr>\n<td>vLLM<\/td>\n<td>Multi-user production serving on GPUs<\/td>\n<td>Complex setup; not local-first<\/td>\n<\/tr>\n<tr>\n<td>llama.cpp<\/td>\n<td>Maximum speed and embedded\/edge hardware<\/td>\n<td>Lowest-level; you assemble it yourself<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If you&#8217;re one person experimenting, Ollama wins on sheer convenience. 
The moment you need to serve many users at once, you&#8217;ll want to read our full breakdown of <a href=\"https:\/\/convly.ai\/it\/ollama-vs-lm-studio-vs-vllm-vs-llama-cpp-2026\/\">Ollama vs LM Studio vs vLLM vs llama.cpp<\/a>.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Getting_started_in_two_minutes\"><\/span>Getting started in two minutes<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The barrier to entry is genuinely tiny:<\/p>\n<ol>\n<li><strong>Install it<\/strong> \u2014 download the app for your OS (see our <a href=\"https:\/\/convly.ai\/it\/how-to-install-ollama-2026\/\">step-by-step install guide<\/a>).<\/li>\n<li><strong>Pull and run a model<\/strong> \u2014 <code>ollama run gemma4<\/code> for a strong all-rounder, or <code>ollama run qwen3<\/code> for coding.<\/li>\n<li><strong>Talk to it<\/strong> \u2014 chat in the terminal, or point your app at <code>http:\/\/localhost:11434<\/code>.<\/li>\n<\/ol>\n<p>Before you pick a model, check that your machine can handle it \u2014 our guide to <a href=\"https:\/\/convly.ai\/it\/ollama-system-requirements-2026\/\">Ollama&#8217;s system requirements<\/a> maps model sizes to the RAM and VRAM you actually need.<\/p>\n<p><!--ai-enriched--><\/p>\n<h2><span class=\"ez-toc-section\" id=\"What_hardware_do_you_actually_need\"><\/span>What hardware do you actually need?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama will start on almost any machine with a CPU and 8&nbsp;GB of RAM, but &#8220;starts&#8221; and &#8220;feels usable&#8221; are different questions. The single number that decides your experience is how much memory the model fits into, because the entire model has to sit in RAM (or, ideally, GPU VRAM) while it runs. 
A reliable rule of thumb is roughly <strong>0.6&nbsp;GB of memory per billion parameters<\/strong> at the default Q4_K_M quantization, plus a little headroom for context.<\/p>\n<p>That math gives you a quick sizing guide for the most common model classes:<\/p>\n<table class=\"convly-vs\">\n<tr>\n<th>Model class<\/th>\n<th>Approx. download (Q4_K_M)<\/th>\n<th>Comfortable memory<\/th>\n<\/tr>\n<tr>\n<td>7\u20138B (Llama&nbsp;3.x, Mistral)<\/td>\n<td>~5&nbsp;GB<\/td>\n<td>8&nbsp;GB+<\/td>\n<\/tr>\n<tr>\n<td>13\u201314B (Qwen, Phi)<\/td>\n<td>~9&nbsp;GB<\/td>\n<td>16&nbsp;GB+<\/td>\n<\/tr>\n<tr>\n<td>32B<\/td>\n<td>~20&nbsp;GB<\/td>\n<td>24&nbsp;GB+<\/td>\n<\/tr>\n<tr>\n<td>70B (Llama&nbsp;3.3)<\/td>\n<td>~43&nbsp;GB<\/td>\n<td>64&nbsp;GB+<\/td>\n<\/tr>\n<\/table>\n<p>For most people the practical sweet spot is a GPU or Mac with around <strong>16&nbsp;GB of VRAM or unified memory<\/strong> \u2014 enough to run 7B\u201314B models at speeds that feel instant. A 16&nbsp;GB RTX-class card or a 16&nbsp;GB Apple Silicon Mac both land squarely in this zone.<\/p>\n<p>Two architectural notes matter when you choose. A discrete NVIDIA GPU wins decisively whenever the model fits inside its VRAM, delivering the fastest tokens per second. Apple Silicon&#8217;s <strong>unified memory<\/strong> is the opposite trade-off: it shares all system RAM with the GPU, so a 64&nbsp;GB or 128&nbsp;GB Mac can run 32B\u201370B models that simply won&#8217;t load on a consumer graphics card \u2014 just at lower throughput. The crossover sits around the 24&nbsp;GB model mark.<\/p>\n<p>You <em>can<\/em> run Ollama with no GPU at all. A modern multi-core CPU handles a 7B model at a workable rate, from a few tokens per second up to the low double digits, but 70B models on CPU drop below one token per second \u2014 fine for overnight batch jobs, painful for chat. 
If interactive speed matters, GPU or Apple Silicon acceleration is the deciding factor.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"FAQ\"><\/span>FAQ<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3>Is Ollama free?<\/h3>\n<p>Yes. Ollama is open-source under the MIT license and completely free. The only &#8220;cost&#8221; is the hardware you run it on and the electricity it uses \u2014 there are no per-token charges because nothing goes to a cloud provider.<\/p>\n<h3>Does Ollama send my data anywhere?<\/h3>\n<p>No. By design, inference happens entirely on your machine. The only network traffic is downloading a model the first time you pull it. This is the main reason teams in healthcare, legal, and finance use it \u2014 sensitive prompts never leave the building.<\/p>\n<h3>Do I need a GPU to run Ollama?<\/h3>\n<p>No, but it helps a lot. Ollama runs on CPU alone for smaller models (a 2\u20133B model is comfortable on a modern laptop), and uses your GPU automatically when one is available. For models above ~13B parameters, a GPU or Apple Silicon with unified memory makes a big difference. See our <a href=\"https:\/\/convly.ai\/it\/ollama-system-requirements-2026\/\">system requirements guide<\/a> for specifics.<\/p>\n<h3>What models can Ollama run?<\/h3>\n<p>Over 100 open models, including Meta&#8217;s Llama 3.3 and Llama 4, Google&#8217;s Gemma 4, Alibaba&#8217;s Qwen 3 series, DeepSeek V3 and R1, Mistral, and Microsoft&#8217;s Phi-4. Our pick of the <a href=\"https:\/\/convly.ai\/it\/best-local-llms-to-run-on-ollama-2026\/\">best local LLMs to run on Ollama<\/a> breaks down which to use for which job.<\/p>\n<h3>Is Ollama better than ChatGPT?<\/h3>\n<p>Different tools. ChatGPT gives you a frontier model with no setup but sends your data to the cloud and charges a subscription. Ollama runs smaller open models locally, free and private, but a top local model still trails the very best cloud models on the hardest tasks. 
For privacy, cost, and offline use, Ollama wins; for raw capability on complex reasoning, the cloud frontier is still ahead.<\/p>\n<h3>What is the Ollama API port?<\/h3>\n<p>Ollama exposes its REST API on <code>http:\/\/localhost:11434<\/code> by default. It also offers an OpenAI-compatible endpoint, so a lot of existing OpenAI-SDK code works by simply pointing the base URL at your local Ollama instance.<\/p>\n<h3>Can Ollama replace the OpenAI API in my existing app?<\/h3>\n<p>For most apps, yes. Ollama exposes an OpenAI-compatible endpoint at <strong>http:\/\/localhost:11434\/v1<\/strong>, including the <code>\/v1\/chat\/completions<\/code> route that most tools call. Point your OpenAI client&#8217;s <code>base_url<\/code> at it, pass any placeholder API key, and set the model field to an installed Ollama tag. Embeddings, vision, and tool-calling are supported too, so many projects switch by changing two lines. It covers parts of the OpenAI API rather than every parameter, so verify any exotic fields your app relies on.<\/p>\n<h3>Can I run Ollama without a GPU?<\/h3>\n<p>Yes. Ollama runs entirely on CPU when no compatible GPU is present \u2014 you just need enough system RAM to hold the model. A current multi-core CPU runs a 7B model at usable speeds, but throughput falls off sharply as models grow, and 70B-class models on CPU are too slow for interactive use. For day-to-day chat, a GPU or Apple Silicon Mac makes the difference between sluggish and snappy.<\/p>\n<h3>How much disk space do Ollama models take, and where are they stored?<\/h3>\n<p>Plan for the download sizes above: a 7B model is roughly 5&nbsp;GB on disk, a 70B model around 43&nbsp;GB, and pulling several models adds up quickly. By default they live under <code>~\/.ollama\/models<\/code> (or <code>C:\\Users\\&lt;you&gt;\\.ollama\\models<\/code> on Windows). 
You can relocate that directory with the <code>OLLAMA_MODELS<\/code> environment variable, and remove anything you no longer need with <code>ollama rm &lt;model&gt;<\/code>.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Bottom_line\"><\/span>Bottom line<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ollama won the local-LLM space in 2026 by doing one thing extremely well: removing friction. It&#8217;s free, private, runs on hardware you already own, and gets you from &#8220;I want to try a local model&#8221; to a running model in about two minutes. It isn&#8217;t the fastest option under heavy load, and a local model won&#8217;t beat the best cloud frontier on the hardest problems \u2014 but as the on-ramp to local AI, nothing else comes close. If you&#8217;re starting out, start here.<\/p>\n<p><!--related-block--><\/p>\n<div class=\"convly-related\">\n<h2><span class=\"ez-toc-section\" id=\"Related_articles\"><\/span>Related articles<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li><a href=\"https:\/\/convly.ai\/it\/gpt-5-6-what-we-know-2026\/\">GPT-5.6: What We Know vs. What Has Leaked (2026)<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/kimi-k2-7-code-explained-2026\/\">Kimi K2.7 Code Explained: Moonshot&#8217;s 1-Trillion-Token Open Coding Model<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/glm-5-2-explained-2026\/\">GLM 5.2 Explained: Zhipu&#8217;s Open Coding Model with a 1-Million-Token Context<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/ollama-vs-jan-2026\/\">Ollama vs Jan: Which Local AI App Wins in 2026?<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/lm-studio-complete-guide-2026\/\">LM Studio: The Complete Guide (2026)<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/claude-5-new-ai-models-june-2026\/\">Is There a Claude 5? 
Claude Fable 5 and All the Major AI Models of June 2026<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/llm-hallucinations-complete-guide\/\">LLM Hallucinations in 2026: Why They Happen and How to Stop Them<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/prompt-engineering-techniques\/\">Prompt Engineering in 2026: 12 Techniques That Actually Work<\/a><\/li>\n<li><a href=\"https:\/\/convly.ai\/it\/ollama-vs-lm-studio-vs-vllm-vs-llama-cpp-2026\/\">Ollama vs LM Studio vs vLLM vs llama.cpp: Which Should You Choose in 2026?<\/a><\/li>\n<\/ul>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Ollama turned running a local LLM from a weekend project into a single command. Here&#8217;s exactly what it is, how it works under the hood, and why it became the default in 2026.<\/p>","protected":false},"author":1,"featured_media":798,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[3],"tags":[650,256,259,423,649,651],"class_list":["post-792","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-llms","tag-llama-cpp","tag-local-llm","tag-ollama","tag-open-source-ai","tag-run-llm-locally","tag-self-hosted-ai"],"_links":{"self":[{"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/posts\/792","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/comments?post=792"}],"version-history":[{"count":5,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/posts\/792\/revisions"}],"predecessor-version":[{"id":1201,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/posts\/792\/revisions\/1201"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/media\/798"}],"wp:attachment":[{"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/media?parent=792"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/categories?post=792"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/convly.ai\/it\/wp-json\/wp\/v2\/tags?post=792"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}