Solidity considered harmful

Preface

I love crypto. I am fascinated by what smart contracts can do. However, expressive smart contracts are a ridiculously young concept. Ethereum launched in 2015; that’s only about 8 years ago! As an industry, we are still experimenting and learning the best way to develop them. This blog post is not dunking on smart contracts or the team behind Solidity.

As I delve deeper into Solidity, my concerns only continue to grow. The language is plagued with jank and seemingly endless amounts of horrifying trivia. It’s alarming to see how DeFi is built upon a language seemingly designed by candlelight.

Solidity is fundamentally harmful and broken. It’s ridiculously easy to pick up and start writing, and because of that, it’s made out to be the best thing since sliced bread by beginners who have never written anything else(!).

This article is very opinionated, and I’m writing this to scream my frustrations about Solidity into the void. So here goes…

At the time of writing, the latest version of Solidity is v0.8.19. Keep in mind that parts of this article may be outdated by the time you read it.

The EVM

To understand why EVM languages are designed as they are, some basic knowledge of how the EVM functions is required. The reader should be familiar with the terms stack, memory, and bytes. I won’t go over exactly how the EVM works, but I’ll quote the ethereum.org wiki, and I recommend you read the full article if you are interested in learning more:

The EVM executes as a stack machine with a depth of 1024 items. Each item is a 256-bit word, which was chosen for ease of use with 256-bit cryptography (such as Keccak-256 hashes or secp256k1 signatures).

During execution, the EVM maintains a transient memory (as a word-addressed byte array), which does not persist between transactions.

Contracts, however, do contain a Merkle Patricia storage trie (as a word-addressable word array), associated with the account in question and part of the global state.

Compiled smart contract bytecode executes as a number of EVM opcodes, which perform standard stack operations like XOR, AND, ADD, SUB, etc. The EVM also implements a number of blockchain-specific stack operations, such as ADDRESS, BALANCE, BLOCKHASH, etc.

-Ethereum virtual machine, ethereum.org

With that out of the way, we need to clear up the biggest misconception about the EVM: THE EVM IS NOT TURING COMPLETE.

It is often advertised that the EVM is a Turing-complete virtual machine, but that is only partially true. While the EVM can act as a Turing machine in the abstract, it cannot be considered Turing complete when used on a blockchain. This is due to the block gas limit, a measure used to prevent DoS attacks and limit state bloat. The gas limit caps the amount of storage reads/writes and computation that can happen in a block. As a result, every program running on the chain must fit within the gas limit, meaning that every program is guaranteed to stop executing, either by finishing or by running out of gas (so the halting problem never arises). Therefore, in practice, the EVM is not Turing complete. Keep this in mind for now; it will become relevant once we look at the design choices Solidity made. This also leads us to my first main gripe with Solidity:

Solidity has an identity crisis

Solidity has an identity crisis, attempting to be both a high-level and low-level language. In the end, Solidity is Java mated with C++ in such a way that the best genes of both failed to exert a phenotype.

Solidity was designed with ease of use in mind. Its syntax is reminiscent of languages like JS because it was designed for anyone to pick up and learn. While this approach may be suitable for simple contracts, the language’s mixture of low-level features with high-level abstractions results in an awkward and cumbersome syntax.

As a result, Solidity programmers often have to fight with the language and usually resort to inline Yul for things that would be trivial in a low-level language.

Type system

Consider the following code snippet:

uint32 to = uint32(bytes4(bytes20(uint160(bytes20(accessList[k][i] & 0xffffffff000000000000000000000000000000)) >> 120)));
uint32 from = uint32(bytes4(bytes20(uint160(bytes20(accessList[k][i] & 0x00000000ffffffff0000000000000000000000)) >> 88)));
uint32 amount = uint32(bytes4(bytes20(uint160(bytes20(accessList[k][i] & 0x0000000000000000ffffffffffffffff000000)) >> 24)));

This code extracts to, from, and amount from a single packed bytes value. Solidity only allows explicit conversions between types of the same size or the same kind (integer to integer, fixed-size bytes to fixed-size bytes). So to extract everything properly, we need to size it down to bytes20, then convert that to uint160, where we do our maths. After that, we cast it back to bytes20, convert it to bytes4, and then finally to uint32.

This example perfectly demonstrates Solidity’s identity crisis. It’s too low level for an autocast and too high level to deal directly with bytes.
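To make the conversion rules a bit more concrete, here is a minimal sketch. The contract and function names (CastDemo, top4, low4) are made up for illustration and are not taken from the snippet above; the point is only that getting from bytes20 to uint32 always takes two hops, and that the order of the hops decides which end of the value you keep.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical demo contract; names are illustrative only.
contract CastDemo {
    // Reads the first (highest-order) 4 bytes of a bytes20 as a uint32.
    function top4(bytes20 packed) public pure returns (uint32) {
        // uint32(packed) does not compile: bytes20 -> uint32 changes both size and kind.
        // bytes20 -> bytes4 truncates on the right, so the first 4 bytes are kept.
        // bytes4 -> uint32 is allowed because both are 4 bytes wide.
        return uint32(bytes4(packed));
    }

    // Reads the last (lowest-order) 4 bytes of a bytes20 as a uint32.
    function low4(bytes20 packed) public pure returns (uint32) {
        // bytes20 -> uint160 is allowed because both are 20 bytes wide.
        // uint160 -> uint32 truncates on the left, so the low-order bits are kept.
        return uint32(uint160(packed));
    }
}
```

Fixed-size bytes truncate from the right and integers truncate from the left, which is exactly the kind of trivia you have to keep in your head while writing casts like the ones above.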

But why not just use yul?

See the “Inline yul” section below.

No proper decimal support

Solidity has experimental support for fixed-point decimal types, but you cannot perform arithmetic on them, so this is still relevant.

The EVM only has integers, which is good. Floating-point numbers are very unreliable, so it makes sense to only do integers. Even though Vyper has a built-in abstraction for decimal numbers, Solidity has yet to add one.

Instead, we must use various third-party libraries or resort to custom solutions. The first option adds dependencies on third-party authors. In the worst case, this can result in a hack due to a coding error or a supply chain attack. The second option is better in terms of security, but it can be time-consuming and is still not guaranteed to be error-free. Why decimal support wasn’t enshrined in the language like SafeMath was is beyond me.
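For reference, the "custom solution" route usually ends up looking something like the sketch below: integers scaled by 10^18. The library name WadMath and its functions are hypothetical and only illustrate the idea; this is not an audited implementation and it always rounds down.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical 18-decimal fixed-point helpers; a sketch, not an audited library.
library WadMath {
    uint256 internal constant WAD = 1e18; // represents 1.0

    // Multiplies two 18-decimal values, rounding down.
    // Reverts if a * b overflows uint256 (checked arithmetic since 0.8).
    function wmul(uint256 a, uint256 b) internal pure returns (uint256) {
        return (a * b) / WAD;
    }

    // Divides two 18-decimal values, rounding down.
    // Reverts if b is zero or if a * WAD overflows.
    function wdiv(uint256 a, uint256 b) internal pure returns (uint256) {
        return (a * WAD) / b;
    }
}

contract FeeExample {
    // Applies a fee rate expressed as an 18-decimal value (0.003e18 means 0.3%).
    function feeOf(uint256 amount, uint256 rate) external pure returns (uint256) {
        return WadMath.wmul(amount, rate);
    }
}
```

Every project ends up rewriting some variant of this, each with its own rounding and overflow choices, which is the whole problem.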

Syntax

Writing readable and auditable code is crucial when designing smart contracts. It helps you, the developer, understand your code better. You’re much more likely to spot potential bugs, logic issues, or other exploits when the codebase you’re working on is easy to read. Proper formatting, naming, and code comments can also give your less technically inclined users a very high-level overview of what your contracts do. This is important if your users need to trust your code with, for example, their life savings.

This is why the trend of writing less readable code is concerning. Solidity is generally very readable; as a developer or user, you can quickly glance at the code and see what it’s supposed to do. But with gas golfing so prevalent these days, along with some less-than-desirable features of the language, Solidity code is becoming less and less readable.

Modifiers

Function modifiers can be used to amend the semantics of functions in a declarative way (see Function Modifiers in the contracts section).

-Solidity docs

While this might seem ok, especially since they’re so common, I cannot overstate how bad this is for readability! Take a look at the following contracts:

contract PurchaseModifier {
    address public seller;

    modifier onlySeller() { // Modifier
        require(
            msg.sender == seller, "Only seller can call this.");
        _;
    }

    function abort() public onlySeller { // Modifier usage
        // ...
    }
}

contract Purchase {
    address public seller;

    function abort() public { 
        require(msg.sender == seller, "Only seller can call this.");
        // ...
    }
}

It’s quite clear that the second contract is much more readable. Now consider that modifiers are usually hidden away in random imports…

import "./modifiers.sol";

contract PurchaseModifier is Modifiers {
    function abort() public onlySeller { // Modifier usage
        // ...
    }
}

… and you start to see how problematic they can be.

But you might be asking,

“oh, but most modifiers just check for things like if the owner calls the contract.”

This is fair, but if they are doing something that can be accomplished in one line of code, why not include it in the function?

“… but what about reentrancy guards?”

A complete crutch and a waste of gas in 99.9% of use cases. Use the checks-effects-interactions pattern.
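In case the pattern is unfamiliar, here is a minimal sketch. The Vault contract is a made-up example; the only thing that matters is the order of the three steps inside withdraw.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical example contract showing checks-effects-interactions.
contract Vault {
    mapping(address => uint256) public balances;

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw() external {
        uint256 amount = balances[msg.sender];

        // Checks: validate everything up front.
        require(amount > 0, "Nothing to withdraw");

        // Effects: update our own state before handing over control.
        balances[msg.sender] = 0;

        // Interactions: the external call comes last. If the receiver
        // re-enters withdraw(), its balance is already zero and the check fails.
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok, "Transfer failed");
    }
}
```

No modifier, no extra storage slot for a lock, and the function reads top to bottom.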

Inline yul

Complexity is the enemy of security

- Nikolai Mushegian

Gas golfing with Yul feels a lot like trying to fly a plane from the back seat of another plane following behind the first.

The inline assembly you write may or may not save a bunch of gas, but one thing is for certain: it’s sure as hell going to make your code completely unreadable. We’re reaching a critical point where many newly released contracts have a big part of their business logic written in Yul. I have the utmost respect for people who can write Yul efficiently, but the fact remains that Yul is hard to read. I expect we will start to see a lot of business logic exploits in the wild caused by extreme uses of Yul.

This really says it all about how readable yul is.

User-Defined Operators

User-defined operators are a brand new feature introduced in Solidity 0.8.19 and are a way to make user-defined types easier to work with, among other things. In essence, it’s syntactic sugar. But as we’ve seen from other languages, getting carried away with them is easy. People forget that it’s just syntactic sugar, just another way to call a function. This usually leads to problems and will be especially problematic in the realm of smart contract design.

The most obvious problem is that it leads to illegible code. You can no longer tell at a glance what a piece of code is doing. A lot of the time, it’s obvious, but a lot of the time, it can be misleading. In most cases, using plain old functions is much more legible than operator overloading.

Operator precedence

To properly overload an operator, you must also understand transitivity, commutativity, and distributivity and when and how they apply to different operators.

Perhaps you’re a math expert who can navigate all of this with ease. However, can you rely on your fellow programmers to have the same level of proficiency? Remember that you might have to work on and debug their code. (This is one of the reasons why a language like Go got so popular.)

Furthermore, many programmers struggle to write an is-odd function. Given this, it’s difficult to trust the same individuals to adhere to the much more demanding requirements for +, -, *, and / operators.

Legitimate use cases

There are, however, good use cases for operator overloading. For example, overloading operators for custom decimal types makes sense. This is an instance where +, -, /, and * are well-defined and hard to get wrong.
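Here is roughly what that legitimate use looks like, as a sketch. The type name UD18 and the functions addUD18 and mulUD18 are invented for this example; the using ... global directive and the wrap/unwrap calls are the actual 0.8.19 syntax for user-defined value types.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical unsigned 18-decimal fixed-point type.
type UD18 is uint256;

// Operators can only be bound to pure free functions, and the binding
// must be global and live in the same file as the type definition.
using {addUD18 as +, mulUD18 as *} for UD18 global;

function addUD18(UD18 a, UD18 b) pure returns (UD18) {
    return UD18.wrap(UD18.unwrap(a) + UD18.unwrap(b));
}

function mulUD18(UD18 a, UD18 b) pure returns (UD18) {
    // Rescale after multiplying two 18-decimal values; rounds down.
    return UD18.wrap((UD18.unwrap(a) * UD18.unwrap(b)) / 1e18);
}

contract Pricing {
    // Reads like ordinary arithmetic, but every operator is a function call.
    function total(UD18 price, UD18 qty, UD18 fee) external pure returns (UD18) {
        return price * qty + fee;
    }
}
```

Note that the `*` in total silently divides by 1e18. For a decimal type that is what you want, but it also shows how much an innocent-looking operator can hide.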

Mishandling of `DELEGATECALL`

DELEGATECALL is an opcode that runs the code of another contract inside the context of the calling contract: the called code reads and writes the caller’s storage, and the current msg.sender and msg.value are preserved. In other words, it lets a contract borrow another contract’s code while keeping its own storage and context. It is an incredibly important part of modern smart contract design, allowing things like proxies and upgradeable contracts to work.

[Figure: delegatecall]

Because it can run arbitrary code against the storage of the contract it’s called from, it’s very easy to blow yourself up while using it. Solidity’s design assumes that code execution and storage access are always linear and predictable. This isn’t really true for DELEGATECALL, and because of that, you have to go out of your way and use a low-level call to use it. While it’s not unreasonable to gatekeep dangerous features, doing so can also cause more harm than good.

Solidity doesn’t have support for safe storage access with DELEGATECALL. As a result, developers need to be super careful about how they’re working with storage slots. This, however, can be error-prone and difficult to do correctly, especially for complex contracts.
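A minimal sketch of how this goes wrong. Both contracts are made up for illustration; the compiler accepts them without complaint, because it has no idea the two storage layouts are supposed to line up.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical implementation contract.
contract Logic {
    uint256 public value; // slot 0

    function setValue(uint256 v) external {
        // Writes slot 0 of whichever contract's storage is in context.
        value = v;
    }
}

// Hypothetical proxy with a mismatched storage layout.
contract Proxy {
    address public owner; // slot 0: collides with Logic.value
    address public logic; // slot 1

    constructor(address logic_) {
        owner = msg.sender;
        logic = logic_;
    }

    function setValue(uint256 v) external {
        // Runs Logic.setValue against the Proxy's storage.
        (bool ok, ) = logic.delegatecall(
            abi.encodeWithSignature("setValue(uint256)", v)
        );
        require(ok, "delegatecall failed");
        // owner now holds address(uint160(v)): slot 0 was overwritten.
    }
}
```

Calling setValue on the proxy quietly replaces the owner, and nothing in the language warns you about it. Keeping the layouts in sync is entirely on the developer.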

Payable and other hidden “features”

This one is very opinionated, you might disagree, and I could perfectly understand why. This is just a gripe I’ve had with Solidity that I found extremely nonintuitive.

Solidity is really weird about not trusting you and your users to act rationally. What I mean by this is that there are tons of checks everywhere to not let you do dumb things.

For example, Solidity has a hidden check on every non-payable function call that causes your call to revert if you send eth to it. This prevents users from accidentally sending eth to your contract when it can’t process it.

A much less sensible check is the one for non-payable constructors. By default, Solidity also disallows sending eth to a contract on creation. While I understand why this might be wanted for user-facing functions, it makes no sense for constructors. Nobody deploys a contract and sends eth to it by accident.
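To spell out what the hidden check amounts to, here is a small sketch. The contract is hypothetical, and the comments describe the compiler-inserted check only approximately (in the bytecode it is a CALLVALUE test followed by a revert).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract contrasting payable and non-payable entry points.
contract PayableDemo {
    uint256 public counter;

    // Without `payable` here, deploying with a non-zero value would revert.
    constructor() payable {}

    // Non-payable: solc inserts a check roughly equivalent to
    // require(msg.value == 0) before the body runs.
    function bump() external {
        counter += 1;
    }

    // Payable: no check is inserted, so the call may carry eth.
    function bumpAndTip() external payable {
        counter += 1;
    }
}
```

The check is opt-out rather than opt-in, and every non-payable function pays a little gas for it.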

A better solution

Disallowing accidental ether transfers definitely has a use. A much more sensible solution would be to have something like nonpayable, which adds a check if you explicitly want to disallow ether transfers.

♿♿♿`solc`♿♿♿

solc is the original compiler designed along with Solidity. While it is technically not part of the language, it is the reference implementation, and the two are so closely interlinked that it only makes sense to mention it.

It’s no secret that solc is really bad at optimizing code. This is very evident when you look at the following:

[Figure: Solidity vs solyul]

While this might be an unfair comparison, it is still a very good starting point. One is written in a whatever-level-sol-is language, and the other one is essentially written in assembly.

But when you look at other compilers, like clang for example, you will very rarely see such huge differences in performance (compiled C code is often faster than hand-written assembly unless the programmer really knows how to take advantage of all the specific processor instructions and quirks). This is because clang is packed full of various optimizations that modify your code as long as it doesn’t change its semantics.

Free memory pointer

Both contracts have been compiled with solc v0.8.18+commit.87f61d96

Solidity has support for dynamic size types. To facilitate this, a free memory pointer is initialized. In assembly form it looks something like this:

PUSH 0x80
PUSH 0x40
MSTORE

Consider the following contract:

contract free {
    bytes c;
    function a(bytes memory b) public {
        c=b;
    }
}

bytes size is dynamic, and it makes sense for it to need a free memory pointer. Now let’s take a look at the following contract:

contract free {
    bytes32 c;
    function a(bytes32 b) public {
        c=b;
    }
}

bytes32 is not a dynamic type, so the compiler should know exactly how much memory it needs. But if we look at the assembly, solc still allocates a free memory pointer!

`--via-ir` hell

op-ti-mize [verb (trans.)]* … (solc) to modify executable code so that it fails more quickly and spectacularly.

--via-ir is a feature of solc that uses an IR (Yul) of your code to hopefully optimize it. It has been in a perpetual beta state for what feels like an eternity (but it looks like it will be the default by the end of the year; may the lord have mercy on our coins) and is broken for many use cases.

Using IR also changes the semantics of your code. While these changes might not impact most users, it is completely unacceptable to have something like this in a smart contract language! A less aware programmer could enable IR to get those sweet gas optimizations and unknowingly change how their contract behaves. A solution to this would be to have a pragma, just like we have for solc versions, or like we had for ABI coder v2.
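As one concrete illustration, the snippet below is adapted from the example I remember in the Solidity documentation’s list of IR-based codegen changes; treat the exact return values as my recollection of those docs rather than something to rely on. The contract name is made up.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

contract EvalOrder {
    // The result depends on the order in which the two operands are evaluated.
    // As I recall the docs, for a == 1 the legacy pipeline gives 3 (1 + 2)
    // and the IR pipeline gives 4 (2 + 2), with neither result guaranteed.
    function preincr_u8(uint8 a) public pure returns (uint8) {
        return ++a + a;
    }
}
```

Same source, same compiler version, different result depending on a command-line flag. Nothing in the source file records which pipeline the author intended.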

An anecdote

To my knowledge, this behaviour was fixed. Still funny

>Bruh the IR compiler is like a fucking monke yon crack
>I spent the whole day today trying to refactor the least amount of code possible into libraries to get this contract under 24kb
>And to get it to compile I had to make some room on the stack for the extra storage pointers the library needs
>So I removed 3 named return variable inits or so
>and the code goes down by 10kb
>what the actual fuck
>that was not 10kb of shit i removed

>makemake
>Huh what
>Diff?

>it chooses which functions to inline
>and its decision is based on the size of the code contained within that function
>so when you make a seemingly small change it can significantly affect the architecture of the solc output

>makemake
>And I assume this isn't well documented?

>oh no not at all
>literally zero documentation

>makemake
>So in addition to producing weird near impossible to debug bugs and garbage code/instructions it does this too
>Truly amazing

Conclusion

In conclusion, Solidity has numerous design flaws and limitations that make it less than pleasant to work with. The language’s mixture of high-level and low-level features makes it awkward and cumbersome to use. The compiler’s weak optimizations push programmers towards inline assembly just to avoid dealing with its output. While Solidity has its advantages, such as its ecosystem, it is clear that the language is fundamentally poorly designed.

If you’re looking for an alternative language that is designed with more straightforward and sensible choices, consider giving Vyper a try. Its performance and more rational design make it an attractive option. I might even write a blog post about it!