{"id":10841,"date":"2024-03-14T12:52:32","date_gmt":"2024-03-14T12:52:32","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/"},"modified":"2025-08-04T04:37:06","modified_gmt":"2025-08-04T04:37:06","slug":"what-is-the-method-for-creating-a-pytorch-dataset","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/","title":{"rendered":"Create PyTorch Dataset: Step-by-Step Guide"},"content":{"rendered":"<p>PyTorch offers a class called Dataset that can be used to create custom datasets. To create a dataset, you need to inherit from the Dataset class and implement the methods __len__ and __getitem__.<\/p>\n<p>The __len__ method returns the size of the dataset, which is the number of data samples.<\/p>\n<p>The __getitem__ method returns the corresponding data sample based on the given index. In this method, data files can be read, data can be preprocessed, and the required input and output data for the model can be returned.<\/p>\n<p>Here is a simple example demonstrating how to create a custom dataset class.<\/p>\n<pre class=\"post-pre\"><code><span class=\"hljs-keyword\">import<\/span> torch\r\n<span class=\"hljs-keyword\">from<\/span> torch.utils.data <span class=\"hljs-keyword\">import<\/span> Dataset\r\n\r\n<span class=\"hljs-keyword\">class<\/span> <span class=\"hljs-title class_\">CustomDataset<\/span>(<span class=\"hljs-title class_ inherited__\">Dataset<\/span>):\r\n    <span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title function_\">__init__<\/span>(<span class=\"hljs-params\">self, data<\/span>):\r\n        self.data = data\r\n\r\n    <span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title function_\">__len__<\/span>(<span class=\"hljs-params\">self<\/span>):\r\n        <span class=\"hljs-keyword\">return<\/span> <span class=\"hljs-built_in\">len<\/span>(self.data)\r\n\r\n    <span 
class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title function_\">__getitem__<\/span>(<span class=\"hljs-params\">self, index<\/span>):\r\n        sample = self.data[index]\r\n        <span class=\"hljs-comment\"># The data can be preprocessed here<\/span>\r\n        input_data = sample[:-<span class=\"hljs-number\">1<\/span>]\r\n        target = sample[-<span class=\"hljs-number\">1<\/span>]\r\n        <span class=\"hljs-keyword\">return<\/span> torch.tensor(input_data), torch.tensor(target)\r\n<\/code><\/pre>\n<p>In the example above, the CustomDataset class takes a list of samples as a parameter and implements the methods __len__ and __getitem__. In the __getitem__ method, each sample is split into input data (every element except the last) and a target (the last element), and both are returned as tensors.<\/p>\n<p>Once a custom dataset class has been created, the DataLoader class can be used to load the data in batches and iterate over it while training the model.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>PyTorch offers a class called Dataset that can be used to create custom datasets. To create a dataset, you need to inherit from the Dataset class and implement the methods __len__ and __getitem__. The __len__ method returns the size of the dataset, which is the number of data samples. 
The __getitem__ method returns the corresponding [&hellip;]<\/p>\n","protected":false},"author":14,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[13716,5746,960,75,1239],"class_list":["post-10841","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-custom-dataset","tag-dataset","tag-deep-learning","tag-machine-learning","tag-pytorch"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Create PyTorch Dataset: Step-by-Step Guide - Blog - Silicon Cloud<\/title>\n<meta name=\"description\" content=\"Learn to create custom PyTorch datasets using Dataset class. Implement __len__ and __getitem__ methods for efficient data loading.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Create PyTorch Dataset: Step-by-Step Guide\" \/>\n<meta property=\"og:description\" content=\"Learn to create custom PyTorch datasets using Dataset class. 
Implement __len__ and __getitem__ methods for efficient data loading.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-14T12:52:32+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-04T04:37:06+00:00\" \/>\n<meta name=\"author\" content=\"Noah Thompson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Noah Thompson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\"},\"author\":{\"name\":\"Noah Thompson\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/2e83cc6ab9f60d36921c2d0f9f280f4a\"},\"headline\":\"Create PyTorch Dataset: Step-by-Step Guide\",\"datePublished\":\"2024-03-14T12:52:32+00:00\",\"dateModified\":\"2025-08-04T04:37:06+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\"},\"wordCount\":172,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"keywords\":[\"custom dataset\",\"Dataset\",\"Deep Learning\",\"machine 
learning\",\"PyTorch\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\",\"name\":\"Create PyTorch Dataset: Step-by-Step Guide - Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-14T12:52:32+00:00\",\"dateModified\":\"2025-08-04T04:37:06+00:00\",\"description\":\"Learn to create custom PyTorch datasets using Dataset class. Implement __len__ and __getitem__ methods for efficient data loading.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Create PyTorch Dataset: Step-by-Step Guide\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud 
Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/2e83cc6ab9f60d36921c2d0f9f280f4a\",\"name\":\"Noah Thompson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/350e537e1530ede2762ee0237e877d6693f4f7163ab4f303202cc9a6b27b6cb4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/350e537e1530ede2762ee0237e877d6693f4f7163ab4f303202cc9a6b27b6cb4?s=96&d=mm&r=g\",\"caption\":\"Noah Thompson\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/noahthompson\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Create PyTorch Dataset: Step-by-Step Guide - Blog - Silicon Cloud","description":"Learn to create custom PyTorch datasets using Dataset class. 
Implement __len__ and __getitem__ methods for efficient data loading.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/","og_locale":"en_US","og_type":"article","og_title":"Create PyTorch Dataset: Step-by-Step Guide","og_description":"Learn to create custom PyTorch datasets using Dataset class. Implement __len__ and __getitem__ methods for efficient data loading.","og_url":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-14T12:52:32+00:00","article_modified_time":"2025-08-04T04:37:06+00:00","author":"Noah Thompson","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"Noah Thompson","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/"},"author":{"name":"Noah Thompson","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/2e83cc6ab9f60d36921c2d0f9f280f4a"},"headline":"Create PyTorch Dataset: Step-by-Step Guide","datePublished":"2024-03-14T12:52:32+00:00","dateModified":"2025-08-04T04:37:06+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/"},"wordCount":172,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"keywords":["custom dataset","Dataset","Deep Learning","machine learning","PyTorch"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/","url":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/","name":"Create PyTorch Dataset: Step-by-Step Guide - Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-14T12:52:32+00:00","dateModified":"2025-08-04T04:37:06+00:00","description":"Learn to create custom PyTorch datasets using Dataset class. 
Implement __len__ and __getitem__ methods for efficient data loading.","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-for-creating-a-pytorch-dataset\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Create PyTorch Dataset: Step-by-Step Guide"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/2e83cc6ab9f60d36921c2d0f9f280f4a","name":"Noah 
Thompson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/350e537e1530ede2762ee0237e877d6693f4f7163ab4f303202cc9a6b27b6cb4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/350e537e1530ede2762ee0237e877d6693f4f7163ab4f303202cc9a6b27b6cb4?s=96&d=mm&r=g","caption":"Noah Thompson"},"url":"https:\/\/www.silicloud.com\/blog\/author\/noahthompson\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/10841","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=10841"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/10841\/revisions"}],"predecessor-version":[{"id":154612,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/10841\/revisions\/154612"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=10841"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=10841"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=10841"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}