{"id":39,"date":"2026-05-18T12:37:25","date_gmt":"2026-05-18T12:37:25","guid":{"rendered":"https:\/\/convly.ai\/build-first-machine-learning-model-python\/"},"modified":"2026-05-21T20:33:14","modified_gmt":"2026-05-21T20:33:14","slug":"build-first-machine-learning-model-python","status":"publish","type":"post","link":"https:\/\/convly.ai\/fr\/build-first-machine-learning-model-python\/","title":{"rendered":"How to Build Your First Machine Learning Model in Python (2026)"},"content":{"rendered":"<p>The best way to understand machine learning is to build a model yourself. It&#8217;s far less intimidating than it sounds \u2014 with Python and the right library, your first working model is about 20 lines of code. This tutorial walks through every step, explaining not just <em>what<\/em> to type but <em>why<\/em>.<\/p>\n<div class=\"convly-tldr\">\n<h3>Key takeaways<\/h3>\n<ul>\n<li><strong>You&#8217;ll use<\/strong> Python and scikit-learn \u2014 the standard beginner-friendly ML library.<\/li>\n<li><strong>The workflow:<\/strong> load data \u2192 split it \u2192 train a model \u2192 evaluate \u2192 predict.<\/li>\n<li><strong>The golden rule:<\/strong> always test on data the model never saw during training.<\/li>\n<li><strong>No advanced math needed<\/strong> \u2014 scikit-learn handles the hard parts.<\/li>\n<\/ul>\n<\/div>\n<h2>What you&#8217;ll build<\/h2>\n<p>You&#8217;ll build a <strong>classifier<\/strong> \u2014 a model that sorts things into categories. We&#8217;ll use the classic beginner dataset, the <strong>Iris dataset<\/strong>: measurements of iris flowers (petal and sepal length and width), where the task is to predict the flower&#8217;s species. It&#8217;s small, clean, and built into scikit-learn, so it&#8217;s perfect for a first model.<\/p>\n<p>The same five steps you learn here apply to almost every machine learning project, no matter how large.<\/p>\n<h2>Step 1: Set up your tools<\/h2>\n<p>You need Python and two libraries. 
<strong>scikit-learn<\/strong> is the workhorse \u2014 it provides datasets, algorithms, and evaluation tools in a consistent, beginner-friendly interface. <strong>pandas<\/strong>, the second library, is a general-purpose tool for tabular data; the code in this tutorial does not call it, but it is worth having installed for when you work with your own datasets.<\/p>\n<p>Install them from your terminal:<\/p>\n<pre><code>pip install scikit-learn pandas\n<\/code><\/pre>\n<p>You can write the code in a plain <code>.py<\/code> file, but a <strong>Jupyter notebook<\/strong> (or a free cloud notebook like Google Colab) is ideal for learning \u2014 you run code in small pieces and see each result immediately.<\/p>\n<h2>Step 2: Load the data<\/h2>\n<p>Every ML project starts with data. Here we load the built-in Iris dataset:<\/p>\n<pre><code class=\"language-python\">from sklearn.datasets import load_iris\n\niris = load_iris()\nX = iris.data      # the measurements (the inputs \/ features)\ny = iris.target    # the species (the labels \/ answers)\n\nprint(&quot;Shape of X:&quot;, X.shape)   # (150, 4) \u2014 150 flowers, 4 measurements each\nprint(&quot;Classes:&quot;, iris.target_names)\n<\/code><\/pre>\n<p>Two variables matter here, and the naming is a universal convention:<\/p>\n<ul>\n<li><strong><code>X<\/code><\/strong> holds the <strong>features<\/strong> \u2014 the inputs the model learns from (the four measurements).<\/li>\n<li><strong><code>y<\/code><\/strong> holds the <strong>labels<\/strong> \u2014 the correct answers (the species).<\/li>\n<\/ul>\n<p>Because we have the answers, this is <a href=\"\/supervised-vs-unsupervised-vs-reinforcement-learning\/\">supervised learning<\/a>.<\/p>\n<h2>Step 3: Split the data<\/h2>\n<p>This is the most important step for getting an honest result. You must split your data into two parts:<\/p>\n<ul>\n<li>A <strong>training set<\/strong> the model learns from.<\/li>\n<li>A <strong>test set<\/strong> the model never sees during training \u2014 used only to evaluate it.<\/li>\n<\/ul>\n<p>If you tested on the same data you trained on, you&#8217;d just be measuring memorization, not real learning. 
(This is how you catch <a href=\"\/overfitting-how-to-prevent-it\/\">overfitting<\/a>.)<\/p>\n<pre><code class=\"language-python\">from sklearn.model_selection import train_test_split\n\nX_train, X_test, y_train, y_test = train_test_split(\n    X, y, test_size=0.2, random_state=42\n)\n<\/code><\/pre>\n<p><code>test_size=0.2<\/code> keeps 20% of the data for testing and trains on the other 80%. <code>random_state=42<\/code> just makes the random split reproducible, so you get the same result every run.<\/p>\n<h2>Step 4: Choose and train a model<\/h2>\n<p>Now the machine learning itself. We&#8217;ll use a <strong>Random Forest<\/strong> \u2014 an accurate, reliable, beginner-friendly algorithm (see our <a href=\"\/top-10-machine-learning-algorithms\/\">algorithms guide<\/a>).<\/p>\n<p>In scikit-learn, training a model is two lines:<\/p>\n<pre><code class=\"language-python\">from sklearn.ensemble import RandomForestClassifier\n\nmodel = RandomForestClassifier(random_state=42)\nmodel.fit(X_train, y_train)\n<\/code><\/pre>\n<p>That <code>.fit()<\/code> call <strong>is<\/strong> the training. The model studies the training features and their labels and learns the patterns that connect measurements to species. scikit-learn handles all the math behind that single line.<\/p>\n<h2>Step 5: Evaluate the model<\/h2>\n<p>Now check how well it learned \u2014 using the test set it has never seen:<\/p>\n<pre><code class=\"language-python\">from sklearn.metrics import accuracy_score\n\npredictions = model.predict(X_test)\naccuracy = accuracy_score(y_test, predictions)\n\nprint(f&quot;Accuracy: {accuracy:.2%}&quot;)\n<\/code><\/pre>\n<p><code>.predict()<\/code> asks the model to classify the test flowers; <code>accuracy_score<\/code> compares its guesses to the true answers. 
On the Iris dataset you&#8217;ll typically see accuracy around 95\u2013100% \u2014 your model correctly identifies almost every flower it has never seen before.<\/p>\n<h2>Step 6: Make a prediction on new data<\/h2>\n<p>The real payoff: using the model on brand-new input. Give it a set of measurements and it predicts the species:<\/p>\n<pre><code class=\"language-python\">new_flower = [[5.1, 3.5, 1.4, 0.2]]   # sepal length, sepal width, petal length, petal width (cm)\nprediction = model.predict(new_flower)\n\nprint(&quot;Predicted species:&quot;, iris.target_names[prediction[0]])\n<\/code><\/pre>\n<p>That&#8217;s a complete machine learning model: trained, tested, and making predictions on data it has never encountered.<\/p>\n<h2>The complete workflow<\/h2>\n<p>Setup aside, the five steps from loading data to predicting are not just an exercise; they&#8217;re the skeleton of essentially every supervised ML project:<\/p>\n<table class=\"convly-vs\">\n<thead>\n<tr>\n<th>Step<\/th>\n<th>What it does<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1. Load data<\/td>\n<td>Get features (X) and labels (y)<\/td>\n<\/tr>\n<tr>\n<td>2. Split data<\/td>\n<td>Separate training and test sets<\/td>\n<\/tr>\n<tr>\n<td>3. Train<\/td>\n<td><code>model.fit()<\/code> learns the pattern<\/td>\n<\/tr>\n<tr>\n<td>4. Evaluate<\/td>\n<td>Measure accuracy on unseen test data<\/td>\n<\/tr>\n<tr>\n<td>5. Predict<\/td>\n<td><code>model.predict()<\/code> on new inputs<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Bigger projects add data cleaning, feature preparation, and model tuning \u2014 but this core loop stays the same.<\/p>\n<h2>Where to go next<\/h2>\n<p>To keep building:<\/p>\n<ul>\n<li><strong>Try other algorithms<\/strong> \u2014 swap <code>RandomForestClassifier<\/code> for <code>LogisticRegression<\/code> or <code>SVC<\/code> and compare. 
scikit-learn&#8217;s consistent interface makes this trivial.<\/li>\n<li><strong>Try other datasets<\/strong> \u2014 practice on <a href=\"\/best-free-datasets-machine-learning\/\">free datasets<\/a> that interest you.<\/li>\n<li><strong>Learn data preparation<\/strong> \u2014 real data is messy; cleaning and preparing it is most of the job.<\/li>\n<li><strong>Explore evaluation<\/strong> \u2014 accuracy is just one metric; learn precision, recall, and cross-validation.<\/li>\n<\/ul>\n<h2>FAQ<\/h2>\n<h3>How do I build a machine learning model in Python?<\/h3>\n<p>Use the scikit-learn library. The workflow is: load your data into features (X) and labels (y), split it into training and test sets, create a model and call <code>.fit()<\/code> to train it, evaluate it on the test set, and use <code>.predict()<\/code> for new data. A first model is about 20 lines of code.<\/p>\n<h3>What library should beginners use for machine learning?<\/h3>\n<p>scikit-learn. It offers a wide range of algorithms, built-in datasets, and evaluation tools through one simple, consistent interface, and it handles the underlying math for you. It&#8217;s the standard starting point before moving to deep learning frameworks.<\/p>\n<h3>Do I need to be good at math to build an ML model?<\/h3>\n<p>No. To build models with scikit-learn you need only basic Python and an understanding of the workflow. The library handles the math. Deeper math becomes useful later if you want to tune models expertly or do research.<\/p>\n<h3>Why do I need to split data into training and test sets?<\/h3>\n<p>So you can measure real performance. If you test a model on the same data it trained on, you only measure memorization. A separate test set the model never saw shows whether it genuinely learned the pattern and can generalize to new data.<\/p>\n<h3>What does model.fit() do?<\/h3>\n<p><code>.fit()<\/code> is the training step. 
It feeds the training features and labels to the algorithm, which adjusts its internal parameters to learn the patterns connecting inputs to correct answers. After <code>.fit()<\/code>, the model is trained and ready to make predictions.<\/p>\n<h2>Bottom line<\/h2>\n<p>Building your first machine learning model is genuinely a short, achievable project: install scikit-learn, then load, split, train, evaluate, and predict. Those five steps are the foundation of nearly every supervised ML project you&#8217;ll ever build.<\/p>\n<p>Don&#8217;t just read this \u2014 open a notebook and run the code. Change the algorithm, try a different dataset, break things and fix them. The concepts in machine learning click far faster once you&#8217;ve trained a model with your own hands. When you&#8217;re ready for more, grab a <a href=\"\/best-free-datasets-machine-learning\/\">free dataset<\/a> and build something of your own.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Build a working machine learning model in Python, step by step. 
This beginner tutorial uses scikit-learn to take you from setup to a trained, tested model.<\/p>","protected":false},"author":0,"featured_media":40,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_themeisle_gutenberg_block_has_review":false,"footnotes":""},"categories":[9],"tags":[486,484,476,487,485],"class_list":["post-39","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tutorials","tag-first-ml-model","tag-machine-learning-python","tag-ml-for-beginners","tag-python-tutorial","tag-scikit-learn"],"uagb_featured_image_src":{"full":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python.jpg",1200,630,false],"thumbnail":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python-150x150.jpg",150,150,true],"medium":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python-300x158.jpg",300,158,true],"medium_large":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python-768x403.jpg",768,403,true],"large":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python-1024x538.jpg",1024,538,true],"1536x1536":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python.jpg",1200,630,false],"2048x2048":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python.jpg",1200,630,false],"trp-custom-language-flag":["https:\/\/convly.ai\/wp-content\/uploads\/2026\/05\/build-first-machine-learning-model-python-18x9.jpg",18,9,true]},"uagb_author_info":{"display_name":"","author_link":"https:\/\/convly.ai\/fr\/author\/"},"uagb_comment_info":0,"uagb_excerpt":"Build a working machine learning model in Python, step by step. 
This beginner tutorial uses scikit-learn to take you from setup to a trained, tested model.","_links":{"self":[{"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/posts\/39","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/comments?post=39"}],"version-history":[{"count":1,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/posts\/39\/revisions"}],"predecessor-version":[{"id":706,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/posts\/39\/revisions\/706"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/media\/40"}],"wp:attachment":[{"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/media?parent=39"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/categories?post=39"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/convly.ai\/fr\/wp-json\/wp\/v2\/tags?post=39"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}