macmee Posted August 10, 2015

I'm just curious about DHTs in general and found this great article on Kademlia: http://gleamly.com/article/introduction-kademlia-dht-how-it-works

From what I understand, and based on this resource on how the BitTorrent DHT works: https://en.wikipedia.org/wiki/Mainline_DHT

It seems that when I begin seeding a torrent, my client does the following:

1. It determines the key within the DHT corresponding to my torrent.
2. The value for a key is a list of seeders, and my client inserts its own IP into that list.

But consider an attacker with 100 clients. What's to stop that attacker from inserting his own IPs, or bogus ones, under the key for this torrent, even though the attacker is not seeding the torrent at all?

And then, when some person X comes along to start downloading the torrent and looks up the seeders in the DHT (finds the value in the DHT for the key corresponding to the torrent), wouldn't X get back a list of 100 bogus IPs and perhaps a small number of legitimate seeders?
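The scenario in the question can be sketched with a toy model. This is not a real DHT: `ToyPeerStore` is a hypothetical stand-in for whichever nodes end up storing the peer list for a key, the addresses are documentation-range placeholders, and the hashed bytes are a placeholder for a real bencoded info dictionary. The only point it illustrates is that storing an announce involves no check that the announcer actually has the data.

```python
import hashlib
from collections import defaultdict


class ToyPeerStore:
    """Hypothetical stand-in for the DHT nodes responsible for a key."""

    def __init__(self):
        self._peers = defaultdict(set)

    def announce(self, info_hash: bytes, peer: tuple) -> None:
        # Nothing here verifies that the peer really seeds the torrent.
        self._peers[info_hash].add(peer)

    def get_peers(self, info_hash: bytes) -> list:
        return sorted(self._peers.get(info_hash, set()))


# The key is the SHA-1 of the torrent's info dictionary (placeholder bytes here).
info_hash = hashlib.sha1(b"placeholder for a bencoded info dictionary").digest()

store = ToyPeerStore()
honest = [("198.51.100.%d" % i, 6881) for i in range(1, 4)]
bogus = [("203.0.113.%d" % i, 6881) for i in range(1, 101)]

for peer in honest + bogus:
    store.announce(info_hash, peer)

peers = store.get_peers(info_hash)
bogus_count = sum(1 for p in peers if p in bogus)
print(len(peers), "peers returned,", bogus_count, "of them bogus")
```

In this toy setup the downloader cannot tell the two groups apart from the lookup result alone; it only finds out by trying to connect.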
Lasent Posted November 4, 2016

Even though this post is a year old and I've only just discovered this forum, I'll try to give an answer for whoever still reads this. I just wrote a small paper about BitTorrent's DHT implementation (Mainline DHT, or MLDHT). One of its biggest shortcomings, as you just pointed out, is that a node can freely choose its own ID. The main reason for this is that for a DHT to work efficiently, node IDs have to be spread out as evenly as possible.

Now how would you bind an ID to a node without any kind of central server? If you think about it, the only way would be to use the node's public IP address. Of course, this isn't that simple either. For starters, a node behind a NAT can't figure out its own public address without the help of another node. This means that a node has to reply with the IP address it sees the request coming from, so that the node behind the NAT can calculate an ID based on it. This opens up another attack: a bogus node could reply with a wrong IP address, which could cause other (although not as significant) problems.

Another issue is actually generating the node ID. An IPv4 address is 32 bits long, while a node ID is 160 bits, so using only the IP address wouldn't be enough (you can't generate every one of the 2^160 possible node IDs from 2^32 IP addresses).

Long story short, this is a proposed change: http://www.bittorrent.org/beps/bep_0042.html
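For illustration, here is a minimal sketch of how an ID can be tied to an IPv4 address in the style of the BEP 42 proposal linked above, as I understand it: mask the address, mix in 3 bits of a random byte, take CRC32-C of the result, and require the top 21 bits of the node ID to match. The function names are hypothetical, the CRC is a slow bitwise version written out because the Python standard library has no CRC32-C, and the BEP itself should be treated as the authority on the details.

```python
import os
import socket

V4_MASK = bytes([0x03, 0x0F, 0x3F, 0xFF])  # IPv4 mask described in BEP 42


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def id_prefix(ip: str, rand: int) -> int:
    """The 21 ID bits that are bound to the IP address and the random byte."""
    masked = bytearray(a & m for a, m in zip(socket.inet_aton(ip), V4_MASK))
    masked[0] |= (rand & 0x07) << 5
    return crc32c(bytes(masked)) >> 11  # keep the top 21 of 32 bits


def make_node_id(ip: str) -> bytes:
    """Build a 160-bit ID: 21 bits from the IP, the rest random."""
    rand = os.urandom(1)[0]
    node_id = bytearray(os.urandom(20))
    prefix = id_prefix(ip, rand)
    node_id[0] = prefix >> 13
    node_id[1] = (prefix >> 5) & 0xFF
    node_id[2] = ((prefix & 0x1F) << 3) | (node_id[2] & 0x07)
    node_id[19] = rand  # the last byte lets others recompute the prefix
    return bytes(node_id)


def is_valid_node_id(ip: str, node_id: bytes) -> bool:
    """Check that an ID's first 21 bits match the address it was seen from."""
    if len(node_id) != 20:
        return False
    claimed = int.from_bytes(node_id[:3], "big") >> 3
    return claimed == id_prefix(ip, node_id[19])


node_id = make_node_id("124.31.75.21")
print(node_id.hex(), is_valid_node_id("124.31.75.21", node_id))
```

The remaining 139 bits stay random, which is how the scheme still spreads IDs across the 160-bit space even though the IP-derived part is short. It limits how many distinct ID prefixes one address can claim; it does not stop an attacker who controls many addresses.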
macmee Posted February 22, 2017 (Author)

Even though my post is a year old, the other day I did an experiment that is directly relevant to the concerns I asked about: any node can announce to the network that it's seeding a torrent.

I generated a random SHA1 and announced that I was seeding it. 24 hours went by and I asked the DHT to do a lookup for the SHA1. I wasn't expecting to get anything back, because I assumed the announce interval is something like 15 minutes. But to my surprise I got back, from around 20 peers, two IPs, from China and Saudi Arabia.

So for some reason, two nodes in China and Saudi Arabia thought it would be a great idea to lie about seeding a torrent which doesn't even exist. Why would they do this? What's the advantage in doing it?
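For anyone wanting to repeat the experiment, here is a rough sketch of the lookup half: a random 20-byte value standing in for the SHA1, and the bencoded `get_peers` query described in BEP 5. The `bencode` helper is a minimal hand-written encoder, not a library API, and the transaction ID `aa` is an arbitrary choice. Sending the packet over UDP and handling replies is left out.

```python
import os


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("ascii")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("ascii") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # bencoded dict keys must be sorted
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def get_peers_query(node_id: bytes, info_hash: bytes, txn: bytes = b"aa") -> bytes:
    """Build a KRPC get_peers query as laid out in BEP 5."""
    return bencode({
        "t": txn,
        "y": "q",
        "q": "get_peers",
        "a": {"id": node_id, "info_hash": info_hash},
    })


random_info_hash = os.urandom(20)  # a "torrent" that does not exist
my_node_id = os.urandom(20)        # freely chosen, as discussed in this thread
packet = get_peers_query(my_node_id, random_info_hash)
print(len(packet), "byte query for", random_info_hash.hex())
```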
crawldht Posted December 19, 2017

What you are describing is a well-known attack called a Sybil attack, which is a problem not only for BitTorrent but also for Tor and Bitcoin. A malicious actor can easily set up thousands of nodes of his own. Whenever a client asks one of them for peers, the attacker gives it false IP addresses, which are actually addresses controlled by the attacker, effectively putting that client into a denial of service. The attacker returns false IP addresses to every client that comes to fetch a list of peers from him. This technique is actually being used by copyright holders who want to slow down the propagation of their copyrighted data.

The effectiveness of the attack depends on the number of malicious nodes the attacker signs up. The attacker can also sign up peers so that he can distribute incorrect data, and if he happens to be both a DHT node and a peer, he can give you torrent metadata for his malicious data and may start distributing it. It is just that in the end you will realize that the hash of the torrent doesn't match the infohash.
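The final check mentioned above can be sketched as follows, assuming a v1 torrent where the infohash is the SHA-1 of the bencoded info dictionary. `metadata_matches` is a hypothetical helper name, and the byte strings are made-up placeholders rather than real torrent metadata.

```python
import hashlib


def metadata_matches(info_hash: bytes, info_bytes: bytes) -> bool:
    """True if the received info dictionary hashes to the requested infohash."""
    return hashlib.sha1(info_bytes).digest() == info_hash


# Placeholder metadata standing in for a real bencoded info dictionary.
genuine = b"d6:lengthi1024e4:name8:file.bin12:piece lengthi16384ee"
wanted_hash = hashlib.sha1(genuine).digest()

# What a malicious peer might send back instead.
forged = b"d6:lengthi1024e4:name8:evil.bin12:piece lengthi16384ee"

print(metadata_matches(wanted_hash, genuine))
print(metadata_matches(wanted_hash, forged))
```

So the attacker can waste a client's time and bandwidth, but metadata that does not hash to the requested infohash can be detected and discarded.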
microft Posted April 7, 2020

I would guess that there are entities that want a copy of every new torrent that comes out. The NSA comes to mind. The mere fact that some human bothered to share a collection of files makes that collection valuable to entities that can afford to collect everything.

OK, so I assume you just announced the hash_value but didn't stick around and manage peers. So these other entities came in and filled that role. They're not seeding, they're trying to download it.

I recently started experimenting with DHTs, so I'm (somewhat) familiar with them. I haven't dug into the BitTorrent side of things too much.