BITDEER
article

Why Mining Pools Work in Distributed Networks but AI Training Does Not

2026.05.08

This blog uses simple language to explain the key issues behind this difference between Bitcoin mining and AI training. We show why mining pools are naturally fit for distributed setups, why large AI model training struggles to run this way, and why model training is so sensitive to network latency.

As DePIN gains momentum in 2026, many operators and investors are asking the same question: if mining pools can connect miners around the world, why can't DePIN do the same thing for AI training? People think that since crypto mining pools can connect thousands of miners across the globe to mine together, DePIN should be able to do the same thing. They believe DePIN can connect idle GPUs scattered across different data centers to train large language models together.

This idea sounds logical on paper, but in real engineering, it faces major bottlenecks caused by network latency, bandwidth, and synchronization mechanisms.

Bitcoin mining and large AI model training are two completely different computing tasks that have very different requirements for network quality. This blog uses simple language to explain the key issues behind this difference. We show why mining pools are naturally fit for distributed setups, why large AI model training struggles to run this way, and why model training is so sensitive to network latency.

Why Mining Pools Can Use a Distributed Setup while AI Training Struggles

To see the difference clearly, we cannot just look at the outer shell of the servers. We must look at how the machines speak to each other while they work.

Mining Pools Connect Independent Machines

Bitcoin miners (ASIC chips) do a very simple job. They guess random numbers inside the chip to find a hash result that matches the network difficulty. During this process, each miner acts as a completely independent unit.

A miner does not need to know what the machine next to it, or a machine across the ocean, is calculating. It does not need to pause and wait for other machines to return middle data. The calculations happen directly inside the chip, and there is no complex data transfer. The only time a miner communicates with the pool is every few seconds when it uploads its calculated shares as a tiny data packet.

Because of this, mining pool networks have very low requirements for speed and latency. Even if your mining farm is in a remote valley with a network latency of hundreds of milliseconds, your miners will still generate revenue steadily as long as the connection stays active.

AI Training Connects GPUs That Must Work Together

Training large AI models uses the exact opposite logic. When you connect tens of thousands of GPUs to train a large model, they are not running separate small tasks. Instead, they are working together to build a single massive project.

Modern commercial large models use trillions of parameters, which cannot fit into the memory of a single GPU. Engineers must split the model into countless small pieces and distribute them across different GPUs. Each GPU only processes one step or one part of a math matrix. The result from GPU A must go to GPU B instantly. The updates from GPU B must sync, check, and update with all other GPUs across the network before everyone can move to the next step together.

This means AI training is not just about adding everyone's compute power together. It requires a large number of GPUs to stay connected over a high-speed network to minimize the wait time for synchronization after each step.

Independent Compute vs. Heavy Sync Compute

We can use a simple real-life example to explain this:

Bitcoin mining is like a group of people buying lottery tickets at home. Everyone rolls their own numbers without getting in each other's way. Whoever wins simply calls the headquarters to report it. Even if these people live far away and have slow transportation, it does not hurt the overall efficiency of buying lottery tickets.

Training a large model is more like writing a book with thousands of people at the same time. Every writer has to stop after each sentence, compare notes with everyone else, and agree on the next step before moving forward.

In this book-writing setup, if even one person passes their work slowly, everyone else must sit and wait, no matter how fast they write. This is the challenge of heavy synchronization compute. Mining works well in distributed environments, while AI training suffers from delays.

Why Large Model Training Fears Network Latency

Now that we know the differences between these tasks, we can look at a common mistake that beginners make. Why is AI training about more than just how fast a GPU can calculate?

Compute Is Only the Base, Data Flow Is the Key

When evaluating an AI compute center, many people like to count the number of graphics cards and look at their raw speed or memory size. But in large-scale distributed training, fast chips are only the first step.

What truly decides the actual power of a cluster is how fast data can flow between different GPUs. Model training is a loop of calculation, communication, and more calculation. If the calculation only takes a few milliseconds but the communication takes dozens of milliseconds, even the most expensive and advanced chips will waste a lot of time waiting for data to sync.

Parameter Sync and Gradients Form the Core Bottleneck

In real AI training, the biggest bottleneck is called AllReduce synchronization.

Simply put, when thousands of graphics cards process different training data for one round, they each create an update suggestion for the model. To keep the model from going wrong, these thousands of cards must swap their suggestions over the network, blend them together, and calculate a final answer. Then, each card updates its parameters and takes the next batch of data.

If network latency is high and bandwidth is low, the GPUs spend too much time sitting idle while waiting for data synchronization, and their actual use rate will drop significantly.

Data Center Networks Belong in a Different Class Than the Public Internet

To clear this waiting bottleneck, professional AI data centers invest heavily in networks. Switches, network cards, optical modules, and high-speed cables become major cost items.

In a professional AI center, GPUs do not just link to the internet. They connect tightly through NVLink, InfiniBand, or high-performance RoCE networks. The GPUs inside a server swap data at ultra-high speeds, and the racks use dedicated switches and fiber cables to maintain low-latency communication. The goal is to keep thousands of GPUs from waiting or idling, allowing them to work together like a single system.

DePIN uses a different model. It tries to connect GPUs scattered across different cities or countries using the public internet. The public internet works well for regular data transfers, but it cannot meet the low-latency, high-bandwidth, and heavy synchronization requirements of large model training.

The public internet usually has a latency of dozens of milliseconds, along with unpredictable data loss and network jitter.

The gap between microseconds and dozens of milliseconds can be thousands of times wide. If you throw a tightly coupled task that needs microsecond-level sync into a public network with millisecond-level latency, physics dictates the result. The overall training efficiency will drop significantly. In extreme cases, it dramatically increases training time and cost, making the project lose its business appeal.

Why Industrial Mining Farms Struggle to Become Distributed AI Training Nodes

Since distributed AI training requires strict network quality, we can see another practical fact in the industry. Why do traditional industrial mining farms with cheap and abundant power face huge physical barriers when trying to shift to distributed AI?

This is a challenge caused by a mismatch in infrastructure logic.

Traditional Farm Prioritize Power Prices

Over the past decade of development, the number one rule for picking a Bitcoin mining farm has always been to find cheap and abundant power. Many large farms sit in deep mountains, high plateaus, or deserts far from big cities. They operate next to cheap hydro plants, wind farms, or stranded gas sites. This location choice is smart for mining. Miners do not need to move massive amounts of external data. They only need to connect to the internet and submit hash shares on time. Even if the network latency is high, it does not hurt the marginal profit of the farm.

AI training infrastructure uses the exact opposite logic. It needs low-latency nodes close to the core internet backbone first, and energy costs second. A mining farm that sits in a remote location where network data must jump through multiple nodes to reach the internet backbone will struggle to meet the needs of heavy sync compute from day one.

External Network Form the First Physical Bottleneck

Some mining farms have plenty of power, but their buildings only connect to basic commercial fiber or standard broadband networks due to early planning limits.

When upgrading the hardware for a secondary use, the question is not whether the farm can connect to the internet. The question is whether the external network can offer a stable, high-bandwidth path of several gigabits per second (Gbps) to support frequent data alignment between global nodes. Public internet routing, peak-hour traffic, and data loss will significantly lengthen the wait time for synchronization in large model training. For these farms, the physical limits of the external network form a tough barrier.

Internal Network Upgrade Costs Are Hard to Bear

Even if you can pull dedicated network lines from the main internet backbone right to the door of the mining farm, rebuilding the internal network requires heavy capital expenditure.

Traditional ASIC mining rooms only need cheap switches and standard network cables inside the racks. The network setup is flat and loose. To run an AI training cluster, the facility must upgrade its systems thoroughly. It needs to swap in high-performance switches, pay for expensive wiring, and upgrade semi-open buildings into sealed data rooms with strict temperature and humidity controls.

This upgrade is not a simple task of plugging in GPUs to run AI. It is a system-level rebuild that almost tears down the original server room. The cost of this modification often exceeds the financial limits of small and medium operators.

Does DePIN Lack Opportunities in the AI Space Entirely?

Since globally scattered GPUs struggle to handle the heavy sync training of large models directly, does this mean the DePIN model holds no value in the AI era? We should clarify that training across different data centers is not impossible in theory. Large cloud providers and AI infrastructure firms are testing this model using dedicated networks, data compression, and optimized setups. But these solutions rely on highly engineered networks and scheduling systems. They are not the same as loose DePIN nodes connecting over the public internet.

The key point is that not all AI tasks need to sync data as frequently or strictly as cutting-edge large model training. If you drop the overly optimistic idea of using loosely distributed nodes to train ultra-large models, and look instead at cost-sensitive secondary computing markets that tolerate latency, DePIN still has a clear space to survive in AI.

Shifting Focus to Low-Priority Inference

Unlike the training phase, the inference phase of an AI model (where a user inputs a prompt and the model creates a result) is highly independent by nature. The commercial market has a massive volume of long-tail inference tasks that do not care about fast response times. These include batch offline text processing, non-real-time content checks, internal corporate knowledge base searches, and customer support help systems that do not need instant answers.

These tasks can run in queues and do not need millisecond-level feedback. If a DePIN network can use loose, idle older GPUs across the world to offer compute rentals far cheaper than commercial public clouds, it becomes a highly attractive choice for budget-conscious clients who care about unit costs.

Batch Rendering and Video Transcoding

Outside of AI inference, a large volume of traditional heavy computing tasks fits the distributed setup of DePIN perfectly. Examples include movie-grade rendering for 3D animation, batch transcoding for high-dynamic video, and basic scientific computing tasks. These tasks have very clear traits: you can cut them into countless independent small pieces.

The nodes do not need to talk to each other while they work; they just submit each frame as soon as they finish calculating it. As long as you can split the task safely and verify the results, a distributed GPU network can use its low-cost stacking advantage.

Finding Proper Fine-Tuning and Edge Tasks

Under specific conditions, small-scale model fine-tuning (training small models for specific industries) or edge AI tasks are also potential areas for DePIN. The data movement and sync frequency for these tasks are far lower than for frontier large models, keeping the pressure on the network setup within a controllable range.

In summary, the true role for DePIN in AI is not to replace top-tier, highly centralized fast data centers. Instead, it acts as a flexible, cost-effective tool to serve secondary computing needs that care heavily about costs and tolerate latency.

How Do Different Compute Tasks Compare in Network Needs?

To see the boundaries of compute assets across different business lines clearly, we can compare how different tasks rely on networks and latency in this table:

Business TypeCore Computing TaskNetwork Bandwidth NeedLatency ToleranceDistributed Suitability
Bitcoin MiningHash guessing, finding random numbers, submitting sharesExtremely Low (Kbps level)Extremely High (Second-level latency does not hurt core output)Perfect Fit (Naturally matches a decentralized distributed setup)
AI Large Model TrainingParameter sync, frequent full-data updatesHigh (Relies on top data center networks; larger sizes need more)Low (High latency drops GPU use and training efficiency easily)Bad fit for heavy sync model training; needs careful evaluation
AI Edge InferenceReceiving a single input, creating real-time outputsMedium (Depends on the data size of a single input)Low to Medium (Commercial real-time interaction needs low latency)Partially Fit (Depends primarily on the response time needs of the task)
Low-Priority Batch InferenceOffline data checks, large batch text vector processingMediumHigh (Allows asynchronous queue execution)Good Fit (A preferred outlet for cost-sensitive tasks)
Batch / Offline RenderingFrame-by-frame 3D rendering, batch video transcodingMedium (Mainly during downloading and uploading phases)Extremely High (As long as results deliver on time)Perfect Fit (Tasks can be split completely without interfering with each other)
Small-Scale Fine-TuningAdjusting local parameters, training vertical industry modelsMedium to HighMedium (Depends on training size and sync setup)Needs Evaluation (Requires a careful trade-off between bandwidth and local efficiency)

This table reveals a basic rule of the industry: whether a computing task can be distributed globally does not depend on how many machines you own. It depends on whether the computing task itself needs high-frequency, forced data synchronization at a micro level.

What Questions Should a Farm Ask Before Shifting to DePIN or AI?

As a rational operator, before moving your existing infrastructure into DePIN or AI computing tracks, you should run a systematic audit by asking four core questions to avoid blind investments:

Is my network quality fit for AI tasks?

This is the first step of the transition. You need a professional team to evaluate your facility. Does your server room have a path to a high-quality internet backbone? Is the total external bandwidth stable during peak hours? Can the network round-trip time (RTT) stay within your target range? Can the internal wiring and setup upgrade to a high-performance, low-latency switching network? If these network foundations are weak, you should drop heavy sync training tasks immediately.

What type of load fits my power setup?

Bitcoin mining is a classic flexible load. You can join grid demand-response programs anytime to swap shutdowns for cheap power credits. But AI services and enterprise inference need rigid 24/7 power without any fluctuations. You need to check if your power agreement allows you to run GPUs at full load continuously for long periods. Does your facility have enough backup systems, like high-standard backup generators, to handle sudden power cuts?

What type of task fits my hardware?

When buying or reusing graphics cards, do not just look at benchmarking scores. You must check the specs item by item. Does the VRAM size meet the minimum threshold for your target task? Will memory bandwidth create a bottleneck for data flow? Will official drivers and computing frameworks keep supporting and optimizing this old chip architecture over the coming years? Is the actual failure rate and spare parts cost of used hardware acceptable after running under heavy loads?

Does my revenue model make financial sense?

This is the most critical ledger item. The essence of a transition is not whether you can connect to a network, but whether your revenue can cover all holding costs. Operators must build a strict financial model that includes regional power rates, network modification costs, device depreciation, real order availability, labor maintenance expenses, and potential opportunity costs into one calculation.

You Cannot Copy Mining Pool Success Directly to DePIN AI Training

Crypto mining pools can use a decentralized model to connect a massive number of miners because PoW mining tasks have very low requirements for network communication. Each machine works on its own and only reports results periodically.

But training large AI models is different. It is essentially a single system that requires extreme teamwork. It has rigid dependencies on network bandwidth, setup latency, and overall stability. Therefore, when industrial mining farms look to pivot to DePIN or AI compute, they cannot simply copy the loose logic of mining pools connecting the world.

In the hybrid compute cycle of 2026, mature compute asset managers know how to follow physical laws and financial boundaries:

  • Mining Tasks: Good fit for distributed, low-communication, and highly flexible long-cycle monetization.
  • Large Model Training: Belongs entirely in highly centralized dedicated compute centers with ultra-high bandwidth and microsecond-level latency.
  • DePIN Distributed Networks: Should target cost-sensitive tasks pragmatically, such as low-priority inference, offline rendering, batch jobs, and lightweight fine-tuning.

The question is no longer whether DePIN can connect GPUs around the world. The real question is whether the workload itself fits a distributed environment. For mining operators, success depends on whether the workload matches their power, network, and hardware conditions. You can use the Bitdeer mining calculator to enter your specific operational numbers and evaluate the payback period and long-term ROI under different hardware and infrastructure mixes with real data.


AIMiningBeginner

*Information provided in this article is for general information and reference only and does not constitute nor is intended to be construed as any advertisement, professional advice, offer, solicitation, or recommendation to deal in any product. No guarantee, representation, warranty or undertaking, express or implied, is made as to the fairness, accuracy, timeliness, completeness or correctness of any information, or the future returns, performance or outcome of any product. Bitdeer expressly excludes any and all liability (to the extent permitted by applicable law) in respect of the information provided in this article, and in no event shall Bitdeer be liable to any person for any losses incurred or damages suffered as a result of any reliance on any information in this article.