How can we build a PHP Encryptor?
Building a PHP Encryptor: Insights and Ideas
Have you ever wondered how to build your own PHP source code encryptor? While I'm not an expert in this field, I've delved deep into the subject and would like to share my findings, which might help others.
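The PHP interpreter loads a file from the filesystem on every include (unless an opcode cache such as OPcache already holds it) and compiles it into opcodes, PHP's equivalent of bytecode. Compilation runs in stages: the lexer turns the source into tokens, the parser builds an AST (Abstract Syntax Tree) from those tokens, and the compiler emits opcodes from the AST.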
Currently, there are a few encryption solutions available, such as IonCube, SourceGuardian, PhpBolt, and a newer option, phpHidden, though its security remains uncertain. These tools are closed-source, so we can only guess how they operate. However, they all rely on a PHP extension, typically written in C, that hooks into the PHP interpreter. One key hook point is `zend_compile_file`, a function pointer the engine calls each time a file has to be compiled; an extension can save the original pointer and install its own function in its place. This provides a starting point for modifying virtually everything, from the read buffer to the AST and, finally, the opcode array.
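To make the hooking idea concrete, here is a minimal, self-contained sketch of the save-and-replace pattern in plain C. It does not use the real Zend headers: `file_handle`, `op_array`, `engine_compile_file`, `install_hook` and the other names are stand-ins I made up for illustration, and a real extension would do the swap on the engine's own `zend_compile_file` pointer during module startup.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in types, NOT the real Zend structures. */
typedef struct { const char *filename; } file_handle;
typedef struct { char description[64]; } op_array;

typedef op_array *(*compile_fn)(file_handle *fh);

static op_array last_result;

/* Plays the role of the engine's built-in compiler. */
op_array *default_compile(file_handle *fh) {
    snprintf(last_result.description, sizeof last_result.description,
             "compiled:%s", fh->filename);
    return &last_result;
}

/* Global pointer the "engine" calls for every file (hypothetical name). */
compile_fn engine_compile_file = default_compile;

/* The extension keeps the previous pointer so it can delegate to it. */
static compile_fn original_compile_file = NULL;
int hook_calls = 0;

/* Our replacement: runs first, then falls through to the original. */
op_array *hooked_compile(file_handle *fh) {
    assert(original_compile_file != NULL);
    hook_calls++;  /* a loader would inspect or decode the file here */
    return original_compile_file(fh);
}

/* Called once at startup: save the old pointer, install ours. */
void install_hook(void) {
    if (engine_compile_file == hooked_compile) return;  /* already installed */
    original_compile_file = engine_compile_file;
    engine_compile_file = hooked_compile;
}

/* Called at shutdown: put the saved pointer back. */
void remove_hook(void) {
    if (original_compile_file != NULL) {
        engine_compile_file = original_compile_file;
        original_compile_file = NULL;
    }
}
```

After `install_hook()`, every call made through `engine_compile_file` passes through `hooked_compile` first. Note that hooks chain: if another extension installed its own function earlier, it becomes the "original" that ours delegates to.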
Why not simply decrypt during the reading process?
The issue is that the decrypted source sits in memory as plain text before it is lexed and parsed. An attacker does not need to break the encryption at all; dumping that buffer is enough.
In my experiments with modifying the PHP interpreter (available on GitHub), I explored the file `Zend/zend_language_scanner.l`, the lexer definition, which also contains the functions used to read `.php` files from the filesystem. It appears that the entire file is read into a buffer before scanning starts, which suggests that manipulating the code between reading it from the filesystem and interpreting it is feasible.
However, as mentioned, simply decrypting an encrypted `.php` file right before lexing and parsing is not secure, because the plain-text source ends up in memory where it can be captured.
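The weakness can be shown with a small simulation. This is only a sketch under my own assumptions: `compile_source`, `dumping_compile` and `load_encrypted` are hypothetical names, the XOR "cipher" is a toy and not secure, and the real engine is far more involved. The point is just that a decrypt-then-compile loader must hand plain text to some compile function, and whoever controls that function sees everything.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int (*compile_string_fn)(const char *source);

/* What an attacker's hook managed to copy. */
static char captured[128];

/* Stand-in for the engine's compiler: just reports the source length. */
int engine_compile(const char *source) {
    return (int)strlen(source);
}

/* Attacker hook: copies any source passing through, then delegates. */
int dumping_compile(const char *source) {
    snprintf(captured, sizeof captured, "%s", source);
    return engine_compile(source);
}

/* Pointer the loader calls; an attacker can repoint it (hypothetical). */
compile_string_fn compile_source = engine_compile;

/* Toy symmetric XOR transform. NOT real encryption. */
void xor_buf(char *buf, size_t len, char key) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}

/* Naive loader: decrypts the source, then passes the PLAIN TEXT on
   to whatever compile function is currently installed. */
int load_encrypted(const char *cipher, size_t len, char key) {
    char plain[128];
    assert(len < sizeof plain);
    memcpy(plain, cipher, len);
    xor_buf(plain, len, key);
    plain[len] = '\0';
    return compile_source(plain);
}
```

If an attacker sets `compile_source = dumping_compile` before the loader runs, `captured` ends up holding the original source, regardless of how strong the cipher was.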
So, how do existing solutions accomplish secure encryption? The details are murky, but one can speculate:
Consider building your own encryptor by cloning the PHP interpreter and separating out the lexer, parser, and opcode generator. Essentially, replicate what PHP does: read the project files, lex them, parse them into an AST, and compile the AST to opcodes. However, instead of executing the opcodes, encrypt them and store the result in a `.php` file.
You would then need to ship a C extension (a `.so` file on Linux) that your customers load into their PHP interpreter. PHP would still read the `.php` file as usual, but instead of sending it through the lexer and parser, your extension decrypts the pre-generated opcodes and hands them to PHP for execution.
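The two halves of this scheme, the build-time encoder and the run-time loader, can be sketched as a round trip. Everything here is an illustrative assumption: the `toy_op` instruction set stands in for real Zend opcodes, the `TENC` marker is a made-up file header, and the XOR keystream is a placeholder that is not secure (a real tool would want an authenticated cipher and would have to serialize the engine's actual opcode structures, which is much harder).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy instruction set standing in for real opcodes. */
enum { OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3, OP_RET = 4 };
typedef struct { int32_t opcode; int32_t operand; } toy_op;

/* Hypothetical marker that identifies an encoded file. */
static const uint8_t MAGIC[4] = { 'T', 'E', 'N', 'C' };

/* Toy XOR keystream. NOT secure; placeholder for a real cipher. */
static void xor_stream(uint8_t *buf, size_t len,
                       const uint8_t *key, size_t key_len) {
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++) buf[i] ^= key[i % key_len];
}

/* Build side: write MAGIC, then the encrypted opcode payload.
   Returns bytes written, or 0 if `out` is too small. */
size_t encode_ops(const toy_op *ops, size_t n,
                  const uint8_t *key, size_t key_len,
                  uint8_t *out, size_t out_cap) {
    size_t payload = n * sizeof(toy_op);
    if (out_cap < sizeof MAGIC + payload) return 0;
    memcpy(out, MAGIC, sizeof MAGIC);
    memcpy(out + sizeof MAGIC, ops, payload);
    xor_stream(out + sizeof MAGIC, payload, key, key_len);
    return sizeof MAGIC + payload;
}

/* Load side: returns the number of ops decoded, or -1 when the input
   is not an encoded file (the caller would fall back to the normal
   compiler in that case). */
int decode_ops(const uint8_t *in, size_t len,
               const uint8_t *key, size_t key_len,
               toy_op *ops, size_t ops_cap) {
    if (len < sizeof MAGIC || memcmp(in, MAGIC, sizeof MAGIC) != 0) return -1;
    size_t payload = len - sizeof MAGIC;
    if (payload % sizeof(toy_op) != 0) return -1;
    if (payload / sizeof(toy_op) > ops_cap) return -1;
    memcpy(ops, in + sizeof MAGIC, payload);
    xor_stream((uint8_t *)ops, payload, key, key_len);
    return (int)(payload / sizeof(toy_op));
}

/* Tiny stack machine standing in for the engine's executor. */
int32_t run_ops(const toy_op *ops, int n) {
    int32_t stack[16];
    int sp = 0;
    for (int i = 0; i < n; i++) {
        switch (ops[i].opcode) {
        case OP_PUSH: if (sp < 16) stack[sp++] = ops[i].operand; break;
        case OP_ADD:  if (sp >= 2) { sp--; stack[sp - 1] += stack[sp]; } break;
        case OP_MUL:  if (sp >= 2) { sp--; stack[sp - 1] *= stack[sp]; } break;
        case OP_RET:  return sp > 0 ? stack[sp - 1] : 0;
        default:      break;  /* unknown opcode: ignored in this toy */
        }
    }
    return sp > 0 ? stack[sp - 1] : 0;
}
```

In a real loader, `decode_ops` would sit inside the compile hook: files that start with the marker are decoded and returned as an opcode array, and everything else is passed on to the original compiler untouched. The decrypted opcodes still exist in memory at that point, so this raises the effort for an attacker rather than removing the exposure.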
This method could potentially be more secure because the original source never reaches the customer's machine: an attacker would have to capture the opcodes from memory and reverse-engineer them back into readable code. Additionally, this approach might not slow down your PHP application, since it skips the lexing and parsing stages. The key is ensuring your decryption step takes no longer than lexing, parsing, and opcode generation would have taken together.