MaidSafe And What It Might Mean For Digital Asset Management


Last year, I wrote an article about the future role of data storage exchanges where capacity could be traded, and considered how they might work and what risks they might introduce.  While researching some of our other recent feature articles, I came across something called MaidSafe, which appears set to provide exactly that (and goes quite a bit further).  What MaidSafe offers is more than just storage: it seems more like a protocol that will support application stacks in the same way that HTTP etc. do, but storage is one clear use-case that all DAM News readers should be able to understand, even if the technicalities of the implementation are harder to grasp.

From what I can tell, MaidSafe is a portmanteau of MAID (Massive Array Of Internet Disks) and SAFE (Secure Access For Everyone).  MAID is a concept that has existed for some time; the ‘SAFE’ part is a decentralised peer-to-peer network which MaidSafe have devised.  They use a kind of ‘Star Trek’ approach to redundant data storage where objects (i.e. files) are split into numerous fragments, encrypted and then distributed across the SAFE network, with multiple independent copies of each piece.  When the user requests the file, the fragments are retrieved, decrypted and reassembled.  There are more details in the features section of their website:

“When a user uploads (or saves) a file to the network, via one of the SAFE Network apps, the file is automatically broken up into chunks. These chunks are then encrypted (encoded so that only authorised parties can read it), randomised and stored on the computers of other SAFE Network users. These encrypted chunks are completely unreadable and inaccessible to anyone other than the owner.” [Read More]
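To make the chunk/encrypt/scatter idea a little more concrete, below is a much-simplified sketch in Python.  It illustrates the general principle only: the chunk size, the SHA-256 content addressing and the toy XOR cipher are my own illustrative assumptions, not MaidSafe’s actual self-encryption algorithm.

    import hashlib

    CHUNK_SIZE = 1024 * 1024  # assumed 1MB chunks; the real client decides this

    def split_into_chunks(data: bytes) -> list:
        # break the file into fixed-size pieces
        return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

    def encrypt_chunk(chunk: bytes, key: bytes) -> bytes:
        # toy XOR stream 'cipher' keyed from SHA-256; a real client
        # would use a proper symmetric cipher here
        keystream = hashlib.sha256(key).digest()
        stream = (keystream * (len(chunk) // len(keystream) + 1))[:len(chunk)]
        return bytes(a ^ b for a, b in zip(chunk, stream))

    def store_file(data: bytes, secret: bytes) -> list:
        # encrypt every chunk and return the content addresses used to
        # scatter them; the network, not the user, decides placement
        addresses = []
        for index, chunk in enumerate(split_into_chunks(data)):
            encrypted = encrypt_chunk(chunk, secret + index.to_bytes(4, 'big'))
            addresses.append(hashlib.sha256(encrypted).hexdigest())
        return addresses

The key point the sketch captures is that no single node ever holds a readable file, only an encrypted fragment identified by a content address.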

The capacity is provided globally by everyone who has a SAFE client installed on a computer with some spare storage.  If a node disappears (e.g. gets switched off, unplugged from the internet etc.) then another takes its place.  Four copies of each fragment are maintained at all times, and some intelligent caching is used to increase performance for popular objects.  The SAFE Network is not yet fully live and is still being tested prior to an official launch, although it has already attracted tech press attention from Wired and TechCrunch.
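As a thought experiment, the replication rule might behave something like the following sketch.  The node and chunk identifiers are invented for illustration, and the real network chooses replacement nodes by XOR distance rather than at random, so treat this as a caricature of the behaviour rather than the actual mechanism.

    import random

    REPLICAS = 4  # the network's stated minimum number of copies

    def maintain_replicas(chunk_holders: dict, live_nodes: set) -> None:
        # re-create copies of any chunk whose holders have dropped off the network
        for chunk_id, holders in chunk_holders.items():
            holders &= live_nodes  # discard nodes that have gone offline
            candidates = list(live_nodes - holders)
            while len(holders) < REPLICAS and candidates:
                new_node = random.choice(candidates)  # real network: nearest in XOR space
                candidates.remove(new_node)
                holders.add(new_node)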

The incentive to provide storage is generated via a process called farming, which is similar to Bitcoin mining.  Where it differs is that farming is not an entirely self-serving exercise (as Bitcoin mining is); instead, the reward reflects the volume of resources that the ‘farmer’ contributes.  As discussed in my other articles about digital commodities, I believe this is a more plausible approach because the underlying work has intrinsic value, i.e. capacity for data storage and the processing power to encrypt it so that it remains private.
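By way of illustration only, a reward function along these lines would scale with useful work rather than raw hashing power.  The rates and formula below are entirely hypothetical and are not SafeCoin’s actual farming algorithm.

    def farming_reward(gigabytes_stored: float, chunks_served: int,
                       rate_per_gb: float = 0.01, rate_per_chunk: float = 0.0001) -> float:
        # reward scales with storage held and data served, i.e. work that has
        # intrinsic value to other users, unlike proof-of-work mining where
        # the effort itself is useless to anyone else
        return gigabytes_stored * rate_per_gb + chunks_served * rate_per_chunk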

Where the situation becomes more complex is that the digital token used to reward farmers is called SafeCoin, and these will also be required to purchase capacity.  This is another altcoin (alternative to Bitcoin).  An intermediate token called MaidSafeCoin is being issued, and I understand these will be exchanged for SafeCoin when the network goes live.  As I have mentioned before when discussing digital commodities, I am not sure these trading tokens will help their cause in the longer term (or perhaps even the medium term).  I gather that many projects use them as a means to generate initial investment (in fact, the term IPO is used – a direct reference to the Initial Public Offering of stocks on more conventional equity exchanges, even though as securities they are quite different).  As with other tokens, these are highly volatile and the price fluctuates across a huge range.

There is a libertarian sub-text to MaidSafe (and their ilk) which is something of a double-edged sword, especially if you are a prospective corporate user of these technologies.  The tokens or coins are not just there to help raise funds; they also facilitate privacy and allow users to avoid having their activities tracked.  The flip-side is that you cannot simply buy storage in regular fiat currencies (e.g. Dollars, Euros etc.) but have to purchase these floating-rate tokens, which might crash or rapidly increase in price within the space of a single day.  For conventional commodities, dedicated exchanges have existed for hundreds of years and special contracts like futures, forwards etc. are available so commercial buyers and sellers can agree a guaranteed price that is stable for the duration of the contract (and speculators can trade in and out of them or use spot prices if they wish).  To offer this, however, a degree of centralisation might be required to enforce regulation – which seems to be in conflict with the objectives of many of the architects of these protocols.  It is quite possible that there are alternative methods to realise these goals without needing a regulatory authority to oversee and enforce compliance, but I have not read about them as yet.

With reference to this issue, unlike Bitcoin and similar, MaidSafe make a point of not using a blockchain (i.e. a distributed ledger of all the transactions).  They cite the ballooning size of the Bitcoin blockchain as it grows in popularity, and the need to distribute updated copies of it, as reasons why they do not use one.  The issue with that, however, is that there is no external method to track the distribution of the fragments of your data across the SAFE network (at least, none that I could identify).  A number of the clients I work with (especially public sector ones) are subject to strict regulations under which their data is not permitted to be stored outside key jurisdictions (e.g. the EU or mainland USA etc.).  I do not know whether you can stipulate that with a SAFE client, nor unequivocally prove it, but it does not appear that you currently can.

A good point advanced by MaidSafe is that depositing your data with a cloud provider is potentially risky because providers are opaque and you have limited visibility of their business or any IT-specific practices they are engaged in.  Your data goes into a kind of managed black hole and you have to hope you can get it out again; further, you do not necessarily know how secure it is and (to an extent) you are obliged to depend on the assurances of the provider.  This has always been one of the key issues with cloud storage: ‘hope for the best’ and ‘safety in numbers’ are not valid risk management techniques, nor is the current size or reputation of the provider (as a number of high-profile financial organisations have demonstrated in the last ten years).  The only partial solution to this problem using the kind of commodity cloud storage providers that most DAM users have available to them is to store your data with more than one.  That reduces the risk of data loss, but it actually increases the chances that your data will be compromised, unless you encrypt it yourself prior to storage.  Real-time data transmission to multiple storage locations and hand-rolled encryption (even if you use some third-party components to do it) are quite a lot of work to both implement and maintain.  This is where MaidSafe (and other technologies that might use the same concepts) could offer advantages that should be very interesting to those responsible for providing digital asset libraries.
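For comparison, the DIY approach amounts to something like the sketch below: encrypt once locally, then push the same ciphertext to several independent providers.  The uploader callables are stand-ins for real SDK calls (e.g. wrappers around boto3 for S3), Fernet is just one off-the-shelf symmetric cipher, and key management is left out entirely – which is precisely the sort of detail that makes the hand-rolled route laborious.

    from cryptography.fernet import Fernet

    def replicate_encrypted(path: str, key: bytes, uploaders: list) -> None:
        # encrypt the file once locally, then push the same ciphertext everywhere
        with open(path, 'rb') as f:
            ciphertext = Fernet(key).encrypt(f.read())
        for upload in uploaders:       # e.g. wrappers around each provider's SDK
            upload(path, ciphertext)   # the provider only ever sees ciphertext

    # hypothetical usage:
    #   key = Fernet.generate_key()
    #   replicate_encrypted('asset.tif', key, [to_s3, to_gcs])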

At present, MaidSafe is still in the ‘emerging’ category only (and to be fair, they haven’t even fully launched yet).  Assuming solutions to some of the issues described can be devised, for corporate DAM it could initially offer a credible alternative to long-term archival services like Amazon Glacier, and eventually become a far superior option to single-provider real-time cloud storage options currently on the market like S3, Google Cloud Storage, Azure etc.  What may need to happen to enable this, however, is some kind of optional secondary tier that sits above the core protocol and supports at least the following (a sketch of what such a tier might record follows the list):

  • The ability to forward-book capacity at a pre-agreed price (i.e. a futures contract)
  • Options to stipulate the geographical regions where data is allowed to be held
  • Transactional transparency so you can see where your data is stored (i.e. what nodes)
  • Auditing to identify when data was retrieved and by whom.
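
To illustrate what such a secondary tier might need to track, here is a hypothetical contract structure covering the four points above.  Every field and name is my own invention rather than part of any SAFE proposal.

    from dataclasses import dataclass, field
    from datetime import date

    @dataclass
    class CapacityForwardContract:
        gigabytes: int          # capacity booked in advance
        price_per_gb: float     # pre-agreed price: the 'futures' element
        start: date             # contract period, e.g. one fiscal year
        end: date
        allowed_regions: list = field(default_factory=lambda: ['EU'])  # geographic restriction
        audit_log: list = field(default_factory=list)                  # retrieval records

        def record_access(self, timestamp, node_id, action, actor):
            # transactional transparency: which node held the data and who touched it
            self.audit_log.append((timestamp, node_id, action, actor))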

These would probably need to be offered independently for those who specifically need them (and only with those nodes who agree to participate).  I note that, according to MaidSafe, once the storage is purchased you keep it forever (because you own their tokens).  I don’t know how practical that will be over the longer term and I have not properly considered the implications of it, but one option with a secondary protocol that has the features I have described is that the capacity purchased is not perpetual, but instead lasts for a set period (e.g. a year).  Given that most organisations of any size tend to run on annual budgets, and they currently have to re-purchase cloud storage every month (and budget for it at the start of a fiscal year), that might be a trade-off they would accept.

One further point for DAM vendors to consider is the model employed by MaidSafe.  While copying them to offer distributed data storage is probably not the best idea, as they appear to have spent eight years working on it, there are (as we have described many times in the past) many other value enhancements that DAM solutions generate.  These usually get diluted because the firms who develop them have diffuse objectives and are unable to concentrate on specific areas of expertise sufficiently to make real progress.  Anyone who is willing to see the DAM rat race for what it is inexorably becoming may wish to reflect on some of these innovations taking place outside the DAM market as it exists now, and consider their implications for their own Digital Asset Management operations.

5 Comments

  • Nice article. I have been following this project for the past couple of years and noticed a few minor points that I thought I would clarify. First, MAID stands for Massive Array of Internet (not Idle) Disks. Secondly, the ability to choose which nodes hold your information is impossible, and this is in fact a feature, as it increases security and also because of churn (churn is when a computer that holds bits of your information is turned off; the autonomous network immediately produces another copy and chooses where that copy goes based on XOR space, not geographical location). There are other points I would make, but if you have read this far you might as well go to the dedicated forum to get a better understanding: forum.safenetwork.io

  • On the ‘idle’ vs ‘internet’ question for the ‘i’ in MAID, I think I got that from Wikipedia and a few other places (‘independent’ being the other suggestion). ‘Internet’ sounds more reasonable though, so I’ve adjusted the article.

    On the other point, I’ve contributed to the thread at forum.safenetwork.io. To summarise what I’ve said there: I can see why not everyone would want these features, but for most enterprises they are essential, as such organisations are legally obliged to be able to audit where their data is stored (and some also have to keep it within certain regional boundaries). These restrictions would currently prevent SAFE being used, and they are not likely to get overturned any time soon; in fact, they will probably become more demanding.

    What I envisaged is an optional extension to the SAFE protocol that provides some of these capabilities (which I accept not everyone will want). Anyone who doesn’t see the value in them can use the original core protocol. I wouldn’t regard it as unreasonable for the operators of the nodes that support this extension to charge a fee for providing their services, which could be combined with a forward contract booking charge to get a guaranteed price for storage.

  • In the case of one’s data being auditable and verifiably within the constraints of a geographic region, I do not see how that could be modified at the app level to change fundamentals at the core level. Perhaps in cases where auditing is paramount an integration with Factum (factum.org) is ideal.

  • I’m not sure about the technicalities of this (and my software engineering skills are in steep decline these days, if they ever were any good to start with) but I’m guessing this is probably like Tor or something similar, where each node knows about its peers but not the final destination of a given packet. Given that the client has to be able to retrieve all the fragments to serve the file, however, I can’t quite believe it’s impossible to achieve; otherwise you wouldn’t be able to get your data back again.

    I can accept that additional metadata might need to be added for this extra optional layer. To get an up-to-date audit, requests could be required to ‘refresh’ this data periodically (which means it might not always be accurate at the client end). I can also acknowledge that implementing this would reduce the security compared with the original method, but these are trade-offs which many people would tolerate to get the key benefits and still remain compliant (and they would almost certainly pay for the privilege too).

    A lot of this comes down to whether those involved with SAFE regard enterprise markets as a priority. If they do, these issues will probably get solved; if not, they won’t. It’s that simple. I would suggest this is something they should think seriously about, however, as it will encourage wider adoption, which might ultimately affect the long-term prospects of the protocol.

  • I believe that SafeNet will be a successful project and will change the paradigm.
