office-gobmx/include/svl/whichranges.hxx
Armin Le Grand (allotropia) 290c8f6e04 ITEM: Change SfxItemSet to use unordered_set
With all the changes done for Items we can now do deeper
basic changes to the ItemSet itself with manageable risk.

I already did https://gerrit.libreoffice.org/c/core/+/166455
aka 'ITEM: Add measurements for SfxItemSet usages' to
get some statistical information about the fill/usage grade
of the ItemSet's fixed PtrArray to SfxPoolItems, check
that out to get an own picture.

Those results show that an average usage is between some
extremes ranging from 0% to 50%, but when using more checks
and using multiple files/interactions/edits in all
applications we end up with around typical 12%-19% of that
array used, the rest is nullptr's.

Thus I thought about one of the initial ideas of this
series of changes (ITEM), to use a std::unordered_map
(A) instead of that fixed array of SfxPoolItem Ptr's (B).
Tthat again was for a complete type-based rewrite, which
I once did a POC, but the code cannot be adapted to that,
just too much work.

Those are very different in architecture, (B) is done
since a long time (since ever), but as pointed out above,
(A) is now possible. There are many aspects to it, let's
grep some:

Speed (iterate): (A) and (B) are both linear. (A) has
less entries, but may touch different mem areas
(buckets). (B) is linear, but many empty spaces which
are usually uselessly iterated.

Speed (access Item by WhichID): (A) is hashed by
WhichID, so mostly linear for unordered_set/hash.
(B) is in a linear array, but has to calculate the
offset for each WhichID access.

So I guess speed will mostly equal out.

Memory: (A) will be dynamically allocated (buckets),
but stl is highly optimized and may even re-use areas,
has to provide some extra info but will need less
places for Items since it's dynamic and can start empty.
(B) will be allocated once (except AllItemSet) and may
even be 'derived' to the ItemSet (SfxItemSetFixed), but
has to allocate all space at once.

I can go in lots of more detail here, but due to the
results of the statistics I just made a test now,
including measuring some results (will include in
gerrit, not here). I used two pro versions for that.
That way I have some data now.

Result is:
- It is now possible to do this, it runs stable :-)
- Speed: As expected, mostly no change
- Memory: Depending on usage, 0% to 25% less,
  usually around 8-10%

Side effects:
- SfxAllItemSet could be done without WhichRanges,
  thus without expensive 'merges'
- SfxItemSetFixed is not needed. While the idea is
  good, it needs a lot of extra stuff
- Access to Items is linear if set
- WhichRanges: Still needed, but for vaildity
  checking/filtering of ItemSet content
- WhichRanges: Worth to think about if these are
  needed at all, probably just exist for historical
  reasons since allocation/number of added Items
  was never ever dynamic -> just not allocatable

Putting the current version on gerrit, may still
trigger some UTs (checked SW/SC/SD...)

I did not like that functionality at ItemSet that
hands out a vector of the set items for cases where
to avoid iterating and deleting items at the same
time at an ItemSet, so changed to using a local
vector of remembered WhichIDs and deleting after
the iterator is done. I also saw some strange
usages of SfxItemIter in sw which i will have to
check.

Since there are still problems with UTs which I can
not reproduce locally I have now added asserts to
the case that an ItemSet gets changed with still
having active SfxItemIter(s). That is always an
error, but with the architecture of that fixed
array of ItemPtrs did not have devastating effects
(purely by coincidence).
With that asserts, UTs run well in SC and SD, but
in SW I get 11 (eleven!) asserts from the UTs, all
of them from 'ITEM: SfxItemSet ClearItem' BTW.
I guess these have to be fixed before thinking
about this change...

Good news is that all 11 cases were the same in SW,
in SwHistorySetAttrSet::SwHistorySetAttrSet which
does some strange things using two SfxItemIter in
parallel. Thus SW UTs are also clear and I see no
more errors caused by ItemSets being changed while
SfxItemIters are alive.
Bad news is that I still have errors to hunt...

NOTE: Have now cleaned all UTs, this showed that
there are some unexpected side-effects of the Items
being processed in another order when SfxItemIter
is used, also found one case where a
WhichRangesContainer is constructed for a
SfxItemSet using the set items from another ItemSet
and SfxItemIter to do so. There *might* be more
cases not covered by UTs.

NOTE: While speed stays the same and mem is reduced
up to 25% (see measurements in 1st comment) another
*important* aspect is that this frees the way for
using ItemSets *without* WhichRanges - these are
necessary mainly to create that fixed array of
pointers to the Items in a *manageable* size. With
a dynamic structure like unordered_set there is
in principle *no need* anymore to use WhichRanges
to pre-define the Items a Set could hold. There
is one exception: We have cases where one ItemSet
is set at another one with defined WhichRanges to
*filter* the contained Items - these would have to
be identified. This is a rare case and we would have
to check cases where an ItemSet gets set at another
ItemSet. This would be as if all ItemSets would be
AllItemSets in principle - much easier for everyone.

NOTE: Waited for 24.8 split just to not take
unnecessary risks.

Change-Id: I75b0ac1d8a4495a3ee93c1117bdf618702785990
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166972
Reviewed-by: Armin Le Grand <Armin.Le.Grand@me.com>
Tested-by: Jenkins
2024-06-11 17:26:42 +02:00

181 lines
5.8 KiB
C++

/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4; fill-column: 100 -*- */
/*
* This file is part of the LibreOffice project.
*
* This Source Code Form is subject to the terms of the Mozilla Public
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/.
*/
#pragma once
#include <sal/config.h>
#include <sal/types.h>
#include <svl/svldllapi.h>
#include <array>
#include <memory>
#include <cassert>
typedef std::pair<sal_uInt16, sal_uInt16> WhichPair;
namespace svl
{
namespace detail
{
constexpr bool validRange(sal_uInt16 wid1, sal_uInt16 wid2) { return wid1 != 0 && wid1 <= wid2; }
constexpr bool validGap(sal_uInt16 wid1, sal_uInt16 wid2) { return wid2 > wid1; }
template <sal_uInt16 WID1, sal_uInt16 WID2> constexpr bool validRanges()
{
return validRange(WID1, WID2);
}
template <sal_uInt16 WID1, sal_uInt16 WID2, sal_uInt16 WID3, sal_uInt16... WIDs>
constexpr bool validRanges()
{
return validRange(WID1, WID2) && validGap(WID2, WID3) && validRanges<WID3, WIDs...>();
}
// The calculations in rangeSize cannot overflow, assuming
// std::size_t is no smaller than sal_uInt16:
constexpr std::size_t rangeSize(sal_uInt16 wid1, sal_uInt16 wid2)
{
assert(validRange(wid1, wid2));
return wid2 - wid1 + 1;
}
}
template <sal_uInt16... WIDs> struct Items_t
{
using Array = std::array<WhichPair, sizeof...(WIDs) / 2>;
template <sal_uInt16 WID1, sal_uInt16 WID2, sal_uInt16... Rest>
static constexpr void fill(typename Array::iterator it)
{
it->first = WID1;
it->second = WID2;
if constexpr (sizeof...(Rest) > 0)
fill<Rest...>(++it);
}
static constexpr Array make()
{
assert(svl::detail::validRanges<WIDs...>());
Array a{};
fill<WIDs...>(a.begin());
return a;
}
// This is passed to WhichRangesContainer so we can avoid needing to malloc()
// for compile-time data.
static constexpr Array value = make();
};
template <sal_uInt16... WIDs> inline static constexpr auto Items = Items_t<WIDs...>{};
}
#define INVALID_WHICHPAIR_OFFSET (sal_uInt16(0xffff))
/**
* Most of the time, the which ranges we point at are a compile-time literal.
* So we take advantage of that, and avoid the cost of allocating our own array and copying into it.
*/
class SVL_DLLPUBLIC WhichRangesContainer
{
using const_iterator = WhichPair const*;
WhichPair const* m_pairs = nullptr;
sal_Int32 m_size = 0;
mutable sal_uInt16 m_TotalCount = 0;
// variables for buffering the last used WhichPair to allow fast answers
// in doesContainWhich
mutable sal_uInt16 m_aLastWhichPairOffset = INVALID_WHICHPAIR_OFFSET;
mutable sal_uInt16 m_aLastWhichPairFirst = 0;
mutable sal_uInt16 m_aLastWhichPairSecond = 0;
/** if true, we allocated and need to delete the pairs, if not, we are pointing
* at a global const literal */
bool m_bOwnRanges = false;
/**
* Determines the number of sal_uInt16s in a container of pairs of
* sal_uInt16s, each representing a range of sal_uInt16s, and total capacity of the ranges.
*/
void CountRanges() const;
public:
#if !defined NDEBUG
inline bool validRanges2() const
{
for (sal_Int32 i = 0; i < size(); ++i)
{
auto p = (*this)[i];
if (!svl::detail::validRange(p.first, p.second))
return false;
// ranges must be sorted
if (i < size() - 1 && !svl::detail::validGap(p.second, (*this)[i + 1].first))
return false;
}
return true;
}
#endif
WhichRangesContainer() = default;
WhichRangesContainer(std::unique_ptr<WhichPair[]> wids, sal_Int32 nSize)
: m_pairs(wids.release())
, m_size(nSize)
, m_TotalCount(0)
, m_aLastWhichPairOffset(INVALID_WHICHPAIR_OFFSET)
, m_aLastWhichPairFirst(0)
, m_aLastWhichPairSecond(0)
, m_bOwnRanges(true)
{
CountRanges();
}
template <sal_uInt16... WIDs>
WhichRangesContainer(svl::Items_t<WIDs...>)
: m_pairs(svl::Items_t<WIDs...>::value.data())
, m_size(svl::Items_t<WIDs...>::value.size())
, m_TotalCount(0)
, m_aLastWhichPairOffset(INVALID_WHICHPAIR_OFFSET)
, m_aLastWhichPairFirst(0)
, m_aLastWhichPairSecond(0)
, m_bOwnRanges(false)
{
CountRanges();
}
WhichRangesContainer(const WhichPair* wids, sal_Int32 nSize);
WhichRangesContainer(sal_uInt16 nWhichStart, sal_uInt16 nWhichEnd);
WhichRangesContainer(WhichRangesContainer const& other) { operator=(other); }
WhichRangesContainer(WhichRangesContainer&& other);
~WhichRangesContainer();
WhichRangesContainer& operator=(WhichRangesContainer&& other);
WhichRangesContainer& operator=(WhichRangesContainer const& other);
bool operator==(WhichRangesContainer const& other) const;
const_iterator begin() const noexcept { return m_pairs; }
const_iterator end() const noexcept { return begin() + size(); }
bool empty() const noexcept { return m_size == 0; }
sal_Int32 size() const noexcept { return m_size; }
WhichPair const& operator[](sal_Int32 idx) const noexcept
{
assert(idx >= 0 && idx < size() && "index out of range");
return m_pairs[idx];
}
void reset();
// calculate and return if nWhich is member of one of
// the local WhichRanges and such contained in the
// defined range
bool doesContainWhich(sal_uInt16 nWhich) const;
// Adds a range to which ranges, keeping the ranges in valid state (sorted, non-overlapping)
SAL_WARN_UNUSED_RESULT WhichRangesContainer MergeRange(sal_uInt16 nFrom, sal_uInt16 nTo) const;
sal_uInt16 TotalCount() const { return m_TotalCount; }
};
/* vim:set shiftwidth=4 softtabstop=4 expandtab cinoptions=b1,g0,N-s cinkeys+=0=break: */