nikic


Member Since 11 years ago

JetBrains, Berlin, Germany

1 organization

php.net

Followers: 4.9k
Following: 25
Stars: 76
Repos: 81

4158 contributions in the last year


6 Pinned

⚡ The PHP Interpreter
⚡ A PHP parser written in PHP
⚡ Fast request router for PHP
⚡ Extension that adds support for method calls on primitive types in PHP
⚡ Iteration primitives using generators
⚡ Extension exposing PHP 7 abstract syntax tree
Jun 23 (8 hours ago)
pull request

nikic merge to php/php-src

nikic

JIT/AArch64: Support shifted immediate

As pointed out by MikePall in [1], shifted immediate values are supported; see [2]. For example, add x0, x1, #4096 is encoded by DynASM directly as add x0, x1, #1, lsl #12.

This patch adds a helper that checks whether an immediate value falls in one of the two allowed ranges: (1) 0 to 4095, or (2) any value from the first range shifted left by 12 bits (LSL #12).

Note that this helper works for add/adds/sub/subs/cmp/cmn instructions.
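The range check described above can be sketched as follows. This is a minimal illustration, not the helper from the patch: the name fits_addsub_imm and the u64 parameter type are assumptions made for the example.

```rust
/// Returns true if `imm` is encodable as an AArch64 add/sub-style immediate:
/// either a 12-bit value (0..=4095) or such a value shifted left by 12 bits.
/// `fits_addsub_imm` is a hypothetical name, not the helper added by the patch.
pub fn fits_addsub_imm(imm: u64) -> bool {
    const MASK12: u64 = 0xfff;
    // Range (1): no bits set outside the low 12 bits.
    let plain = (imm & !MASK12) == 0;
    // Range (2): no bits set outside bits 12..=23 (a 12-bit value with LSL #12).
    let shifted = (imm & !(MASK12 << 12)) == 0;
    plain || shifted
}
```

Under this sketch 4096 is accepted (it is 1 with LSL #12), while 4097 is rejected because it has bits set in both the low and the shifted field.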

[1] https://github.com/LuaJIT/LuaJIT/pull/718
[2] https://github.com/LuaJIT/LuaJIT/blob/v2.1/dynasm/dasm_arm64.lua#L342

Test: all ~4k .phpt test cases under tests/, Zend/tests/, and ext/opcache/tests/jit/ pass for Linux JIT/arm64. Note that 8 JIT variants are tested in total, covering ZTS/nonZTS, HYBRID/VM, and functional/tracing JIT.

Change-Id: I4870048b9b8e6c429b73a4803af2a3b2d5ec0fbb

Jun 22 (1 day ago)
created branch

nikic in nikic/llvm-project create branch perf/loop-distribute

created 13 hours ago
push

nikic push llvm/llvm-project

nikic

Revert "[compiler-rt] Make use of undefined symbols configurable"

This reverts commit ed7086ad46f99f639b85ea6c8bda7c1a71be7c53. This reverts commit b9792638b0bfb308e0c7c125ac78f4ebf910c11b.

This breaks cmake with message:

CMake Error at llvm-project/compiler-rt/CMakeLists.txt:449:
  Parse error.  Expected "(", got newline with text "

commit sha: ae1093921fc83294a310cd6e7bb721970754ddcb

pushed 13 hours ago
push

nikic push llvm/llvm-project

nikic

[OpaquePtr] Support changing load type in InstCombine

When the load type is changed to ptr, we need the load pointer type to also be ptr, because it's not allowed to create a pointer to an opaque pointer. This is achieved by adjusting the getPointerTo() API to return an opaque pointer for an opaque pointer base type.

Differential Revision: https://reviews.llvm.org/D104718

commit sha: 7bb7fa12e73bd3c9fb66f05825758d729dd96ba5

pushed 13 hours ago
issue

nikic issue comment php/php-src

nikic

Remove " " being considered as an invalid filename for `is_file()`

Some builds fail because of it, last one to date is: https://travis-ci.com/github/php/php-src/jobs/517817543

nikic

This is probably a test parallelisation issue. Some other test must be creating that file, else it wouldn't exist.

push

nikic push llvm/llvm-project

nikic

[OpaquePtr] Handle addrspacecasts in InstCombine

This adds support for addrspace casts involving opaque pointers to InstCombine, as well as the isEliminableCastPair() helper (otherwise the assertion failure would just move there).

Add PointerType::hasSameElementTypeAs() to hide the element type details.

Differential Revision: https://reviews.llvm.org/D104668

commit sha: e790d3667ed4d8f8df0b55f7c93fee0045c0e626

pushed 17 hours ago
push

nikic push llvm/llvm-project

nikic

[ConstantFold] Skip bitcast -> GEP transform for opaque pointers

Same as with the InstCombine transform, this is not possible for bitcasts involving opaque pointers, as GEP preserves opaqueness.

nikic

[ConstantFold] Delay fetching pointer element type

Don't do this while stripping pointer casts; instead, fetch it at the end. This improves compatibility with opaque pointers for the case where the base object is not opaque.

commit sha: e638a290f7d0bb85dbf81ba34eaaeef8c8d1b42d

pushed 19 hours ago
push

nikic push llvm/llvm-project

nikic

[ConstantFolding] Separate conditions in GEP evaluation (NFC)

Handle the gep p, 0-v case separately, and not as part of the loop that ensures all indices are constant integers. Those two things are not really related.

commit sha: 04395fd6cb0949dfe628353cd61bcec3625b8c0d

pushed 23 hours ago
Jun 21 (2 days ago)
push

nikic push llvm/llvm-project

nikic

Reapply [InstCombine] Don't try converting opaque pointer bitcast to GEP

Reapplied without changes -- this was reverted together with an underlying patch.


Bitcasts having an opaque pointer source or result type cannot be converted into a zero-index GEP, because GEP source and result types always have the same opaque-ness.

commit sha: 39796e1ad02a45b09ac3ef9e3dc1906f28804a91

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[InstCombine] Add test for bitcast of unsized pointer (NFC)

The bitcast should get folded into the select, but currently isn't due to an incorrect early bailout.

nikic

Reapply [InstCombine] Extract bitcast -> gep transform

Relative to the original patch, an InstCombine test has been added to show a previously missed pattern, and the Coroutine test that resulted in the revert has been regenerated.


Move this into a separate function, to make sure that early returns do not accidentally skip other transforms. This previously happened for the isSized() check, which skipped folds like distributing a bitcast over a select.

commit sha: e2c2124a4b5bad9cf2a1e23a6aef1b2ad753f504

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[LoopUnroll] Don't modify TripCount/TripMultiple in computeUnrollCount() (NFCI)

As these are no longer passed to UnrollLoop(), there is no need to modify them in computeUnrollCount(). Make them non-reference parameters.

Differential Revision: https://reviews.llvm.org/D104590

nikic

Revert "[InstCombine] Extract bitcast -> gep transform"

This reverts commit d9f5d7b959de36085944d4a99a73f3053f953796. This reverts commit 5780611d7e044ef56c4214df2c236ef5e15545ab.

This causes a failure in Coroutine tests.

commit sha: 6922ab73a5a5b0d6a65f0b8796e5fae4345dbbd9

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[InstCombine] Extract bitcast -> gep transform

Move this into a separate function, to make sure that early returns do not accidentally skip other transforms. There is already one isSized() check that could run into this issue, thus this change is not strictly NFC.

nikic

[InstCombine] Don't try converting opaque pointer bitcast to GEP

Bitcasts having an opaque pointer source or result type cannot be converted into a zero-index GEP, because GEP source and result types always have the same opaque-ness.

commit sha: 5780611d7e044ef56c4214df2c236ef5e15545ab

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[InstCombine] Remove unnecessary address space check (NFC)

It's not possible to bitcast between different address spaces, and this is ensured by the IR verifier. As such, this bitcast to addrspacecast canonicalization can never be hit.

commit sha: a969bdc56f66a3c059f6d70e574d11fda8354e2a

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[OpaquePtr] Support opaque constant expression GEP

Adjust assertions to use isOpaqueOrPointeeTypeMatches() and make it return an opaque pointer result for an opaque base pointer. We also need to enumerate the element type, as it is no longer implicitly enumerated through the pointer type.

Differential Revision: https://reviews.llvm.org/D104655

commit sha: d9fe96fe264e72c0a5c58cdd40b4efa14d18f475

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[OpaquePtr] Return opaque pointer from opaque pointer GEP

For a GEP on an opaque pointer, also return an opaque pointer (or vector of opaque pointer) result.

This requires explicitly enumerating the GEP source element type, because it is now no longer implicitly enumerated as part of either the source or result pointer types.

Differential Revision: https://reviews.llvm.org/D104652

commit sha: 9f779195d311c983031271d0243d6e6af988ce55

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[Mem2Reg] Regenerate test checks (NFC)

commit sha: acefe0eaaf82c1d31a8d12e15751118eb40fe637

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[Mem2Reg] Use poison for unreachable cases

Use poison instead of undef for cases dealing with unreachable code. This still leaves the more interesting case of "load from uninitialized memory" as undef.

commit sha: 80e0424b2ce9489bec73dbd3b920c4543a25feb1

pushed 1 day ago
push

nikic push llvm/llvm-project

nikic

[Mem2Reg] Regenerate test checks (NFC)

commit sha: 00a88a81d2adcac1f39bc2581c54302aabb1d1dd

pushed 2 days ago
issue

nikic issue comment php/php-src

nikic

Declare tentative return types for ext/json

nikic

@jrfnl For cases where a return type cannot be added either due to PHP version requirements, or library backwards-compatibility policies, the deprecation warning can be suppressed using the #[ReturnTypeWillChange] attribute.

For this particular case, if there are no additional library backwards-compatibility concerns, it's also possible to use a more specific type -- almost all implementations of jsonSerialize() actually return array.

@kocsismate Maybe the deprecation notice should mention the attribute?

Jun 20 (3 days ago)
push

nikic push krakjoe/apcu

nikic

CI: build and test on Windows

nikic

Run GH workflow on PR as well

commit sha: 6c39c034b7b0e1c9c2b8eedf9fdcbae3682ef07e

pushed 2 days ago
pull request

nikic pull request krakjoe/apcu

nikic

CI: build and test on Windows

pull request

nikic merge to php/php-src

nikic

Allow build/gen_stub.php to process multiple CLI file args

E.g. build/gen_stub.php *.stub.php will generate *_arginfo.h from multiple files.

Previously, gen_stub.php would silently ignore files after the first file.

Invoking gen_stub.php with no arguments will continue to process the entire directory.

push

nikic push llvm/llvm-project

nikic

[LoopUnroll] Use smallest exact trip count from any exit

This is a more general alternative/extension to D102635. Rather than handling the special case of "header exit with non-exiting latch", this unrolls against the smallest exact trip count from any exit. The latch exit is no longer treated as privileged when it comes to full unrolling.

The motivating case is in full-unroll-one-unpredictable-exit.ll. Here the header exit is an IV-based exit, while the latch exit is a data comparison. This kind of loop does not get rotated, because the latch is already exiting, and loop rotation doesn't try to distinguish IV-based/analyzable latches.
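The selection rule described above (use the smallest exact trip count over all exits, with no special treatment of the latch) can be sketched as below. This is a toy model, not LLVM code: the Exit struct and the function name are hypothetical, and an exit with no analyzable trip count is represented as None.

```rust
/// One loop exit and its exact trip count, if that exit is analyzable.
/// Hypothetical model for illustration; not an LLVM data structure.
pub struct Exit {
    pub name: &'static str,
    pub exact_trip_count: Option<u32>,
}

/// Smallest exact trip count over all exits, or None if no exit has one.
/// Exits without a known count (e.g. a data-dependent comparison) are skipped.
pub fn smallest_exact_trip_count(exits: &[Exit]) -> Option<u32> {
    exits.iter().filter_map(|e| e.exact_trip_count).min()
}
```

For a loop shaped like the motivating case, with an IV-based header exit of known count and a data-dependent latch exit, the header exit's count is the one returned.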

Differential Revision: https://reviews.llvm.org/D102982

commit sha: 1ae266f4529fe17c11331f11db74428b879f3737

pushed 2 days ago
Jun 19 (4 days ago)
issue

nikic issue comment php/php-src

nikic

Major overhaul of mbstring (part 7)

FYA @nikic

The tests for UTF-7, UTF-8, UTF-16, and UTF-32 should finally land now.

If CI doesn't pass, you don't need to burn your time investigating it... as always, I will figure out what's wrong and fix it.

nikic

@alexdowad Well, given how nobody has complained about it in the last twenty years (or at least, I couldn't find any) it's probably not worthwhile.

push

nikic push llvm/llvm-project

nikic

[LoopUnroll] Push runtime unrolling decision up into tryToUnrollLoop()

Currently, UnrollLoop() is passed an AllowRuntime flag and decides itself whether runtime unrolling should be used or not. This patch pushes the decision into the caller and allows us to eliminate the ULO.TripCount and ULO.TripMultiple parameters.

Differential Revision: https://reviews.llvm.org/D104487

commit sha: 1bd4085e0bbc14ec61ab69c83464098622b2df56

pushed 4 days ago
issue

nikic issue comment rust-lang/rust

nikic

leading_zeros() return value is still bounds checked

leading_zeros() can return at most the number of bits of the underlying data type. The compiler does not seem to consider this during optimization.

Code

#![feature(core_intrinsics)]

const LOOKUP: [usize; 65] = [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 13, 13, 13, 13, 13, 12, 12, 12, 11, 11, 10, 10, 9, 8, 7, 6, 5, 4, 3, 0, 0];

pub fn min_selector(i: u64) -> usize {
    let l_z = i.leading_zeros() as usize;
    // unsafe {core::intrinsics::assume(l_z < 65);}
    return LOOKUP[l_z];
}

Tested on rustc 1.51.0 (2fd73fabe 2021-03-23), nightly in the Compiler Explorer with -C opt-level=3.

I expect the unsafe block to have no effect, since u64::leading_zeros() shouldn't return a number greater than 64. Instead, the disassembly shows that there is a bounds check which disappears when the unsafe block is uncommented.
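The invariant the report relies on can be checked in safe, stable Rust. The sketch below is an illustration only: TABLE_LEN, lookup_index and clamped_index are hypothetical names, and whether the optimizer drops the bounds check for either form depends on the compiler and LLVM version, which is not verified here.

```rust
/// Length of the 65-entry table from the report (indices 0..=64).
const TABLE_LEN: usize = 65;

/// Index computed as in the report; for a u64 it is always in 0..=64.
pub fn lookup_index(i: u64) -> usize {
    i.leading_zeros() as usize
}

/// Variant that restates the bound explicitly without unsafe code.
/// The clamp never changes the value; it only spells out the invariant.
pub fn clamped_index(i: u64) -> usize {
    lookup_index(i).min(TABLE_LEN - 1)
}
```

The extreme inputs are 0, which has 64 leading zeros, and u64::MAX, which has none, so a 65-entry table covers every possible index.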

nikic

Right, this will be pulled in by the next LLVM upgrade.

Jun 18 (5 days ago)
push

nikic push llvm/llvm-project

nikic

[LoopUnroll] Simplify optimization remarks

Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and debug code. For debug code, print information about all exits. For optimization remarks, only include the unroll count and the type of unroll (complete, partial or runtime), but omit detailed information about exit folding, now that more than one exit may be folded.

Differential Revision: https://reviews.llvm.org/D104482

commit sha: 3308205ae9dd3b42e19b377157c642a04312f7fd

pushed 4 days ago
issue

nikic issue comment rust-lang/rust

nikic

add codegen option for using LLVM stack smash protection

LLVM has built-in heuristics for adding stack canaries to functions. These heuristics can be selected with LLVM function attributes. This PR adds a codegen option -C stack-protector={basic,strong,all} which controls the use of these attributes. This gives rustc the same stack smash protection support as clang offers through options -fstack-protector, -fstack-protector-strong, and -fstack-protector-all. The protection this can offer is demonstrated in test/ui/abi/stack-protector.rs. This fills a gap in the current list of rustc exploit mitigations (https://doc.rust-lang.org/rustc/exploit-mitigations.html), originally discussed in #15179.

Stack smash protection adds runtime overhead and is therefore still off by default, but now users have the option to trade performance for security as they see fit. An example use case is adding Rust code in an existing C/C++ code base compiled with stack smash protection. Without the ability to add stack smash protection to the Rust code, the code base artifacts could be exploitable in ways not possible if the code base remained pure C/C++.

Stack smash protection support is present in LLVM for almost all the current tier 1/tier 2 targets: see test/assembly/stack-protector/stack-protector-target-support.rs. The one exception is nvptx64-nvidia-cuda. This PR follows clang's example, and adds a warning message printed if stack smash protection is used with this target (see test/ui/stack-protector/warn-stack-protector-unsupported.rs). Support for tier 3 targets has not been checked.

Since the heuristics are applied at the LLVM level, the heuristics are expected to add stack smash protection to a fraction of functions comparable to C/C++. Some experiments demonstrating how Rust code is affected by the different heuristics can be found in test/assembly/stack-protector/stack-protector-heuristics-effect.rs. There is potential for better heuristics using Rust-specific safety information. For example, it might be reasonable to skip stack smash protection in functions which transitively only use safe Rust code, or which use only a subset of functions the user declares safe (such as anything under std.*). Such alternative heuristics could be added at a later point.

LLVM also offers a "safestack" sanitizer as an alternative way to guard against stack smashing (see #26612). This could possibly also be included as a stack-protection heuristic. An alternative is to add it as a sanitizer (#39699). This is what clang does: safestack is exposed with option -fsanitize=safe-stack.

The option is only supported by the LLVM backend, but as with other codegen options it is visible in the main codegen option help menu. The heuristic names "basic", "strong", and "all" are hopefully sufficiently generic to be usable in other backends as well.

nikic

> Hmm, the test which fails includes a deliberate stack smash to test the effect of the stack smash protector option. I see three alternatives:

> 1. Try to tweak the stack smashing so it works on all targets. I am guessing that the "malicious" function address being cast and written as a 64-bit might be the problem on the 32-bit platform which fails (perhaps a cast to `usize` would work better?). But it could be other things, so this might require several attempts to get right. (On that note, is the core dump which the test log says was generated made available anywhere?)

I doubt it's available anywhere, but if you're feeling motivated you can reproduce locally. This should be a matter of running sudo src/ci/docker/run.sh dist-i586-gnu-i586-i686-musl and waiting a bit :) There is a --dev flag with which you can start a bash shell in the container afterwards and poke around, maybe run that particular test under gdb...

> 2. Limit the test to be `x86_64` only (maybe also Linux glibc only: I realize other systems may produce different error messages when stack smashing is prevented). Then the test serves as an executable high-level description of what the stack-protector option is intended to do. Seeing that it works on one platform also gives some confidence that it works on other platforms.

> 3. Remove the test altogether. I quite liked it as an end-to-end test, but one could argue that it is reasonable to keep only the codegen tests and trust that system libraries get the stack smash protection functionality right.

...but if there is no relatively straightforward way to make it work on other targets, then limiting it to just x86_64 sounds fine to me. I don't think there's cause to drop the test entirely. I think it's good to have an end-to-end test that everything works, but it's okay if it doesn't run everywhere. If it works on one target, then the machinery works, and the rest is really LLVM's problem, not ours :)

open pull request

nikic wants to merge nikic/PHP-Parser

nikic

Simplify BuilderHelpers::normalizeName() implementation

In order to get rid of the flag in BuilderHelpers::normalizeNameCommon(), I have moved all the logic related to normalizing the name into the BuilderHelpers::normalizeName() method, and the expr-related logic into the BuilderHelpers::normalizeNameOrExpr() method, which then calls the basic normalizeName() as well.

nikic

This creates some coupling, in that this check needs to stay in sync with the types supported by normalizeName().

I think a possibly better way to go about this would be to have a tryNormalizeName() method that doesn't throw and instead returns null on failure, and then have

public static function normalizeName($name) {
    $name = self::tryNormalizeName($name);
    if ($name !== null) {
        return $name;
    }
    throw new \LogicException('Name must be a string or an instance of Node\Name');
}

and similar for the other method.

pull request

nikic merge to nikic/PHP-Parser

nikic

Simplify BuilderHelpers::normalizeName() implementation

In order to get rid of the flag in BuilderHelpers::normalizeNameCommon(), I have moved all the logic related to normalizing the name into the BuilderHelpers::normalizeName() method, and the expr-related logic into the BuilderHelpers::normalizeNameOrExpr() method, which then calls the basic normalizeName() as well.