Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-7703] Creating an array with 10.000 elements is slow with optimization. #50243

Closed
swift-ci opened this issue May 16, 2018 · 27 comments
Closed
Assignees
Labels
array literals Feature → expressions → literals: Array literals bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself expressions Feature: expressions literals Feature → expressions: Literals such as an integer or string literal optimized only Flag: An issue whose reproduction requires optimized compilation performance SILOptimizer Area → compiler: SIL optimization passes swift 5.8

Comments

@swift-ci
Copy link
Collaborator

swift-ci commented May 16, 2018

Previous ID SR-7703
Radar rdar://problem/40334734
Original Reporter andreasw (JIRA User)
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Bug
Assignee @eeckstein
Priority Medium

md5: a9f53d8799e79775521d84b544cb5974

is duplicated by:

Issue Description:

After #50173 and #50231 were fixed, compiling large arrays works fine without optimization, in a non-asserting build where the SILVerifier is disabled (still waiting for #50242).

Switching on -O slows down compilation again. See the attached Makefile how I create an array of 10.000 Int elements. Compiling with -O takes 16 seconds.

Here are the offending functions from perf:

+ 58,96% 0,05% swift swift [.] (anonymous namespace)::RedundantLoadElimination::run ▒
+ 57,33% 2,04% swift swift [.] (anonymous namespace)::BlockState::processStoreInst ▒
+ 50,06% 28,52% swift swift [.] llvm::DenseMapBase<llvm::SmallDenseMap<swift::LSLocation, unsigned int, 32u, llvm::DenseMa▒
+ 32,62% 2,89% swift swift [.] swift::LSLocation::isMayAliasLSLocation ▒
+ 32,01% 0,05% swift swift [.] swift::LSLocation::enumerateLSLocation ▒
+ 31,97% 0,07% swift swift [.] swift::LSLocation::enumerateLSLocations ▒
+ 21,43% 10,09% swift swift [.] swift::LSBase::hasIdenticalProjectionPath ▒
+ 18,76% 4,43% swift swift [.] swift::AliasAnalysis::alias ▒
+ 11,32% 11,25% swift swift [.] swift::ProjectionPath::computeSubSeqRelation ▒
+ 10,97% 10,95% swift swift [.] swift::ProjectionPath::hasNonEmptySymmetricDifference ▒
+ 8,96% 8,95% swift swift [.] swift::ValueEnumerator<swift::ValueBase*, unsigned long>::getIndex ▒
+ 8,49% 0,00% swift swift [.] swift::SILPassManager::runFunctionPasses ▒
+ 8,42% 0,00% swift swift [.] swift::SILPassManager::runPassOnFunction ▒
+ 6,72% 0,01% swift swift [.] llvm::DenseMapBase<llvm::SmallDenseMap<swift::LSLocation, unsigned int, 32u, llvm::DenseMa▒
+ 5,37% 5,36% swift swift [.] llvm::DenseMapBase<llvm::DenseMap<(anonymous namespace)::AliasKeyTy, swift::AliasAnalysis:▒
+ 5,35% 0,00% swift swift [.] llvm::DenseMapBase<llvm::SmallDenseMap<swift::LSLocation, unsigned int, 32u, llvm::DenseMa▒
@dcci
Copy link
Mannequin

dcci mannequin commented May 16, 2018

cc: @eeckstein / @gottesmm

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

(Temporarily) Disabling the RedundantLoadElimination pass brings it down to 1.3 seconds.

@gottesmm
Copy link
Member

@swift-ci create

@swift-ci
Copy link
Collaborator Author

swift-ci commented Nov 2, 2018

Comment by Andreas Wendleder (JIRA)

Xcode 7.1 is still unusable with this.

@swift-ci
Copy link
Collaborator Author

swift-ci commented Nov 16, 2018

Comment by Andreas Wendleder (JIRA)

I put together a nice little diagram showing the quadratic complexity of RLE:

https://docs.google.com/spreadsheets/d/16VakcOYvcwZbmPRLxO_1HcLUwCx0xEVT5mXWYsYnQjY/edit?usp=sharing

It shows that RLE is the worst offender in compile times with large matrices.

In effect, compiling an array with one million elements would take roughly 80 hours with Swift, whereas g++ with optimisation takes less than 4 seconds, and Python compiles and executes it within less than 2 seconds. I attached a new Makefile for C++ and Python.

Inlining is also a problem but this is already addressed in #51712.

@swift-ci
Copy link
Collaborator Author

swift-ci commented Nov 17, 2018

Comment by Andreas Wendleder (JIRA)

After #51712 was fixed this is the last remaining artefact that exhibits quadratic complexity with large arrays.

Please have a look at https://docs.google.com/spreadsheets/d/16VakcOYvcwZbmPRLxO_1HcLUwCx0xEVT5mXWYsYnQjY/edit#gid=0

I will have more information the next days.

cc @eeckstein @gottesmm trentxintong (JIRA User) @atrick

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

This is not a duplicate and not resolved with Swift 5.

Compiling 10.000 elements with optimisation takes 68 seconds on macOS 10.14.3 and Xcode 10.2 with Swift 5.0

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

Performance numbers with Xcode 10.2.1 and snapshot 2019-04-16 and 10.000 elements:

Xcode 10.2.1: 35 seconds

Snapshot: 30 seconds.

Commits 4e9a9cc and 8f47439 from @eeckstein seem to help only a little bit.

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

Swift 5.0.1 on Linux: 10.000 elements: 14 seconds.

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

Swift 5.1 macOS: 10.000 elements: 20 seconds.

@swift-ci
Copy link
Collaborator Author

swift-ci commented Oct 31, 2019

Comment by Andreas Wendleder (JIRA)

Swift 5.1.1 Linux: 5.000 elements: 6.9s

Perf trace:

+ 64,64% 0,00% swift swift [.] swift::SILPassManager::runFunctionPasses
+ 64,44% 0,01% swift swift [.] swift::SILPassManager::runPassOnFunction
+ 56,37% 0,08% swift swift [.] (anonymous namespace)::RedundantLoadElimination::run
+ 46,88% 0,00% swift swift [.] swift::SILPassManager::execute
+ 46,34% 0,00% swift swift [.] swift::SILPassManager::executePassPipelinePlan
+ 37,34% 20,76% swift swift [.] llvm::DenseMapBase<llvm::SmallDenseMap<swift::LSLocation, unsigned int, 32u, llvm
+ 37,20% 1,50% swift swift [.] (anonymous namespace)::BlockState::processStoreInst
+ 26,81% 0,11% swift swift [.] swift::LSLocation::enumerateLSLocations
+ 26,69% 0,04% swift swift [.] swift::LSLocation::enumerateLSLocation
+ 24,84% 0,03% swift swift [.] (anonymous namespace)::DeadStoreElimination::run
+ 23,47% 1,41% swift swift [.] swift::LSLocation::isMayAliasLSLocation
+ 20,78% 4,16% swift swift [.] swift::AliasAnalysis::alias
+ 17,86% 0,03% swift swift [.] (anonymous namespace)::DSEContext::processInstruction
+ 17,47% 7,85% swift swift [.] swift::LSBase::hasIdenticalProjectionPath
+ 11,56% 0,53% swift swift [.] (anonymous namespace)::DSEContext::processStoreInst
+ 10,50% 10,48% swift swift [.] swift::ValueEnumerator<swift::ValueBase*, unsigned long>::getIndex
+ 9,59% 9,56% swift swift [.] swift::ProjectionPath::computeSubSeqRelation

@swift-ci
Copy link
Collaborator Author

swift-ci commented Jun 8, 2020

Comment by Andreas Wendleder (JIRA)

Swift 5.2.4 Linux:

5.000 elements: 6.2s

10.000 elements: 25.4s

@swift-ci
Copy link
Collaborator Author

Comment by Andreas Wendleder (JIRA)

Swift 5.3.3 Linux:

10.000 elements: 16s

C++ (gcc or clang): 0.04s

Python: 0.04s

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@gonsolo
Copy link
Contributor

gonsolo commented Oct 4, 2022

Swift 5.7 Linux:

10.000 elements:

Swift: 5s
C++ (gcc): 0.028s
C++ (clang): 0.051s
Python: 0.048s

Related bugs:
#51712
#51723
#51762

My collection of swift bugs (with only these remaining): https://github.com/gonsolo/swift_bugs

cc @eeckstein @gottesmm @atrick

@gonsolo
Copy link
Contributor

gonsolo commented Oct 4, 2022

perf report:

88.15%     0.00%  swift-frontend   [unknown]                 [k] 0xffffffffffffffff
        |
        ---0xffffffffffffffff
           |          
           |--47.45%--_ZN12_GLOBAL__N_124RedundantLoadElimination3runEv
           |          _ZN12_GLOBAL__N_110RLEContext3runEv
           |          |          
           |           --46.63%--_ZN12_GLOBAL__N_110BlockState16processStoreInstERNS_10RLEContextEPN5swift9StoreInstE7RLEKind
           |                     |          
           |                     |--38.32%--_ZN5swift10LSLocation20isMayAliasLSLocationERKS0_PNS_13AliasAnalysisE
           |                     |          |          
           |                     |          |--22.78%--_ZN5swift13AliasAnalysis5aliasENS_8SILValueES1_NS_7SILTypeES2_

@gonsolo
Copy link
Contributor

gonsolo commented Apr 26, 2023

Swift 5.8 Linux, 10.000 elements: 5.3 seconds.

@AnthonyLatsis AnthonyLatsis added SILOptimizer Area → compiler: SIL optimization passes swift 5.8 literals Feature → expressions: Literals such as an integer or string literal expressions Feature: expressions optimized only Flag: An issue whose reproduction requires optimized compilation labels Apr 26, 2023
@gonsolo
Copy link
Contributor

gonsolo commented May 3, 2023

Running perf:

20.35%  swift-frontend   swift-frontend            [.] llvm::DenseMapBase<llvm::DenseMap<swift::AliasAnalysis::AliasCacheKey, swift::AliasAnalysis::AliasResult, llvm::DenseMapInfo<swift::AliasAnalysis::AliasCacheKey, void>, llvm::detail::DenseMapPair<swift::AliasAnalysis::AliasCacheKey, swift::AliasAnalysis::AliasResult> >, swift::AliasAnalysis::AliasCacheKey, swift::AliasAnalysis::AliasResult, llvm::DenseMapInfo<swift::AliasAnalysis::AliasCacheKey, void>, llvm::detail::DenseMapPair<swift::AliasAnalysis::AliasCacheKey, swift::AliasAnalysis::AliasResult> >::LookupBucketFor<swift::AliasAnalysis::AliasCacheKey>
    13.18%  swift-frontend   swift-frontend            [.] swift::ProjectionPath::hasNonEmptySymmetricDifference
     9.86%  swift-frontend   swift-frontend            [.] swift::AliasAnalysis::alias
     7.05%  swift-frontend   swift-frontend            [.] (anonymous namespace)::DSEContext::invalidateBase
     5.79%  swift-frontend   swift-frontend            [.] (anonymous namespace)::BlockState::processStoreInst

It seems to be the same issue as in #51712. @AnthonyLatsis @eeckstein @atrick

@AnthonyLatsis
Copy link
Collaborator

It seems to be the same issue as in #51712

We should probably close that one as a duplicate and keep this one open until we actually land a timing regression test, or odds are we will keep opening issues just because the stats changed without much improvement in build times.

@gonsolo
Copy link
Contributor

gonsolo commented Jan 16, 2024

Swift 5.9 Linux, 10.000 elements: 4.3 seconds.
C++ 0.05 seconds.
Python 0.05 seconds.

  18,84%  swift-frontend   swift-frontend            [.] bool llvm::DenseMapBase<llvm::DenseMap<swift::AliasAnalysis::AliasCacheKey, swift::AliasAnaly
  15,33%  swift-frontend   swift-frontend            [.] swift::ProjectionPath::hasNonEmptySymmetricDifference(swift::ProjectionPath const&) const
   9,90%  swift-frontend   swift-frontend            [.] swift::AliasAnalysis::alias(swift::SILValue, swift::SILValue, swift::SILType, swift::SILType)
   6,81%  swift-frontend   swift-frontend            [.] (anonymous namespace)::BlockState::processStoreInst((anonymous namespace)::RLEContext&, swift

@eeckstein
Copy link
Member

@gonsolo can you remind me what's the source code for this test?

@eeckstein
Copy link
Member

Ah, you tested with swift 5.9. Can you try with a recent development snapshot?

@gonsolo
Copy link
Contributor

gonsolo commented Jan 16, 2024

@gonsolo can you remind me what's the source code for this test?

It seems the attachment at the top of this issue is corrupted. Here is it again, it's a Makefile:

SIZE=10000
SHELL:=/bin/bash

CONFIGURATION=-O

SWIFTC=swiftc

all: size time_swift time_cpp time_python

size:
	@echo "Size: $(SIZE)"
	@echo ""

time_swift: bug.swift
	@echo "Swift:"
	@time $(SWIFTC) $(CONFIGURATION) bug.swift
	@echo ""

time_cpp: bug.cc
	@echo "C++ (GCC):"
	@time g++ -c -O3 bug.cc
	@echo ""
	@echo "C++ (Clang):"
	@time clang++ -c -O3 bug.cc
	@echo ""

time_python: bug.py
	@echo "Python:"
	@time python bug.py

bug.swift: Makefile
	@echo "let x: [UInt] = [" > $@
	@LANG=C printf '0x%x, ' $$(seq $(SIZE)) >> $@
	@echo "]" >> $@

bug.cc: Makefile
	@echo "unsigned int x[] = {" > $@
	@LANG=C printf '0x%x, ' $$(seq $(SIZE)) >> $@
	@echo "};" >> $@

bug.py: Makefile
	@echo "x = [" > $@
	@LANG=C printf '0x%x, ' $$(seq $(SIZE)) >> $@
	@echo "]" >> $@

time_tuple: tuple.swift
	@time $(SWIFTC) $(CONFIGURATION) tuple.swift

.PHONY: c clean e edit r report p perf
c: clean
clean:
	rm -f bug.cc bug.py bug.swift bug bug.o tuple perf.data perf.data.old
e: edit
edit:
	vi Makefile
p: perf
perf: bug.swift
	time perf record $(SWIFTC) $(CONFIGURATION) bug.swift
	perf report

@gonsolo
Copy link
Contributor

gonsolo commented Jan 16, 2024

Ah, you tested with swift 5.9. Can you try with a recent development snapshot?

Swift snapshot trunk January 15, 2024, Linux, 10.000 elements: 2.96 seconds.
C++ 0.048 seconds.
Python 0.032 seconds.

Better, still 60x slower.

Perf report:

   4,81%  swift-frontend   swift-frontend            [.] (anonymous namespace)::DCE::run()
   2,75%  swift-frontend   swift-frontend            [.] swift::isInstructionTriviallyDead(swift::SILInstruction*)
   2,15%  swift-frontend   swift-frontend            [.] swift::SILInstruction::getResultsImpl() const
   1,97%  swift-frontend   swift-frontend            [.] swift::BottomUpFunctionOrder::DFS(swift::SILFunction*)
   1,78%  swift-frontend   swift-frontend            [.] swift::SILInstruction::getAllOperands() const
   1,72%  swift-frontend   swift-frontend            [.] $s3SIL15InstructionListV4nextAA0B0CSgyF
   1,71%  swift-frontend   libswiftCore.so           [.] swift_release

@gonsolo
Copy link
Contributor

gonsolo commented Jan 16, 2024

That's with "-O" for Swift and "-O3" for C++. Without optimization:
Swift: 1.46s
C++ (GCC): 0.05s
C++ (Clang): 0.06s

   2,53%  swift-frontend   swift-frontend            [.] swift::BottomUpFunctionOrder::DFS(swift::SILFunction*)
   2,15%  swift-frontend   swift-frontend            [.] (anonymous namespace)::SILVerifier::visitSILInstruction(swift::SILInstruction*)
   2,02%  swift-frontend   swift-frontend            [.] swift::SILInstruction::getAllOperands() const
   1,89%  swift-frontend   libswiftCore.so           [.] swift_release
   1,37%  swift-autolink-  libswiftCore.so           [.] swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess>
   1,32%  swift-frontend   libswiftCore.so           [.] swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess>
   1,28%  swift-frontend   swift-frontend            [.] (anonymous namespace)::SILVerifier::_require(bool, llvm::Twine const&, std::function<void ()>
   1,22%  swift-frontend   swift-frontend            [.] bool llvm::DenseMapBase<llvm::DenseMap<std::pair<swift::SILValue, swift::Operand const*>, llv
   1,15%  swift-frontend   swift-frontend            [.] swift::SILInstruction::isMetaInstruction() const
   0,99%  swift-frontend   swift-frontend            [.] swift::SILModule::checkForLeaks() const

@eeckstein
Copy link
Member

Thanks for doing the experiments!
We'll never get to the C++ time because the internal implementations of Array and scalars, like Int, are much more complicated in Swift. The compiler optimizes all the complexity away, but this is at the cost of compile time.

@gonsolo
Copy link
Contributor

gonsolo commented Jan 16, 2024

Nevermind, I started with 16 seconds for 10.000 elements and quadratic complexity here (and I was using much bigger arrays where compile time was unusable). This is the last bug on my list that I collected while writing my renderer.

@gonsolo
Copy link
Contributor

gonsolo commented Feb 16, 2024

Swift version 5.11-dev (LLVM 48dba337c6a2104, Swift 823db1f)

10.000 elements: 2.9s
100.000 elements: 35.0s

Same stack trace as in #50243 (comment)

I think this can be closed, @eeckstein. Thanks for all the work involved.

@eeckstein eeckstein reopened this Feb 16, 2024
@AnthonyLatsis AnthonyLatsis added the array literals Feature → expressions → literals: Array literals label Feb 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array literals Feature → expressions → literals: Array literals bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself expressions Feature: expressions literals Feature → expressions: Literals such as an integer or string literal optimized only Flag: An issue whose reproduction requires optimized compilation performance SILOptimizer Area → compiler: SIL optimization passes swift 5.8
Projects
None yet
Development

No branches or pull requests

5 participants