最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

c++ - Getting a mutable element from a std::set, for operations that won't break the order - Stack Overflow

programmeradmin3浏览0评论

I've got a data type that consists of actual data and an id:

struct Data {
    int value;
    int DataId;
};

I've got a collection of these data from which I have to frequently get an element based on its id and modify its value (but not it's ID). I thus chose a std::set in order to have an efficient sort, with the following comparator:

class Comparator {
    /// dummy type for declaring is_transparent type
    struct SIsTransparentTag;

   public:
    // tag (falsely) indicating that operator() is transparent, i.e. accepts
    //  any type of argument: this is a workaround to allow std::set::find to
    //  use a int key instead of a full Data object.
    //  
    //  /
    using is_transparent = SIsTransparentTag;

    // comparator between two Data
    bool operator()(const Data &lhs, const Data &rhs) const {
        return lhs.DataId < rhs.DataId;
    }
    // comparator between a Data and an id
    bool operator()(const Data &lhs, const int id) const {
        return lhs.DataId < id;
    }
    // comparator between an id and a Data
    bool operator()(const int id, const Data &rhs) const {
        return id < rhs.DataId;
    }
};

Then I wrote a read-write getter:

Data* GetRWDataFromID(std::set<Data,Comparator> &collection,int id)
{
    auto it = collection.find(id);
    if (it == std::end(collection))
    {
        return nullptr;
    }
    // UB as soon as I will modify Data!!!
    return const_cast<Data *>(&(*it));
}

Playground

std::set::find() only returns an iterator to a non-mutable element. However, I only want to be able to modify the value member, thus never changing the set order.

This design is broken because removing constness of an object and modifying it is undefined behavior: .

How can I fix my design while keeping an efficient search (typically logarithmic) and letting the id attached to a data (when I get a data, I want to get its id from it, not from some book-keeping; thus I'd like to avoid a std::map)?

A bonus would be to have DataId non-mutable once a Data object is created (for this one, I imagine I can make it private and use read-only getters?).

I've got a data type that consists of actual data and an id:

struct Data {
    int value;
    int DataId;
};

I've got a collection of these data from which I have to frequently get an element based on its id and modify its value (but not it's ID). I thus chose a std::set in order to have an efficient sort, with the following comparator:

class Comparator {
    /// dummy type for declaring is_transparent type
    struct SIsTransparentTag;

   public:
    // tag (falsely) indicating that operator() is transparent, i.e. accepts
    //  any type of argument: this is a workaround to allow std::set::find to
    //  use a int key instead of a full Data object.
    //  https://en.cppreference/w/cpp/container/set/find
    //  https://www.fluentcpp/2017/06/09/search-set-another-type-key/
    using is_transparent = SIsTransparentTag;

    // comparator between two Data
    bool operator()(const Data &lhs, const Data &rhs) const {
        return lhs.DataId < rhs.DataId;
    }
    // comparator between a Data and an id
    bool operator()(const Data &lhs, const int id) const {
        return lhs.DataId < id;
    }
    // comparator between an id and a Data
    bool operator()(const int id, const Data &rhs) const {
        return id < rhs.DataId;
    }
};

Then I wrote a read-write getter:

Data* GetRWDataFromID(std::set<Data,Comparator> &collection,int id)
{
    auto it = collection.find(id);
    if (it == std::end(collection))
    {
        return nullptr;
    }
    // UB as soon as I will modify Data!!!
    return const_cast<Data *>(&(*it));
}

Playground

std::set::find() only returns an iterator to a non-mutable element. However, I only want to be able to modify the value member, thus never changing the set order.

This design is broken because removing constness of an object and modifying it is undefined behavior: https://en.cppreference/w/cpp/language/const_cast.

How can I fix my design while keeping an efficient search (typically logarithmic) and letting the id attached to a data (when I get a data, I want to get its id from it, not from some book-keeping; thus I'd like to avoid a std::map)?

A bonus would be to have DataId non-mutable once a Data object is created (for this one, I imagine I can make it private and use read-only getters?).

Share Improve this question edited Mar 22 at 13:05 Jarod42 220k15 gold badges196 silver badges331 bronze badges asked Mar 21 at 22:27 OerstedOersted 2,9586 silver badges29 bronze badges 8
  • Modifying an object after const_cast has removed constness is UB only if the object was const to begin with. But that is not the case here. You simply have a const reference/pointer to a non-const (mutable) object, so removing the constness from the reference/pointer is not UB. – Remy Lebeau Commented Mar 21 at 22:51
  • @RemyLebeau: How do you know that the object was not originally const qualified? Isn't that an implementation detail of std::set if the values are stored as const or not? – gerum Commented Mar 21 at 22:56
  • 7 Why can't you use a std::map? You get a Data from somewhere and then you can add it to your map like my_map[data.id] = data.value; – NathanOliver Commented Mar 21 at 23:07
  • Using at least C++17, you can extract a std::set node, modify it, and insert it back. – Eugene Commented Mar 21 at 23:37
  • @NathanOliver absolutely, this is screaming out for a std::map – catnip Commented Mar 21 at 23:56
 |  Show 3 more comments

4 Answers 4

Reset to default 4

Just use std::map directly.

when I get a data, I want to get its id from it, not from some book-keeping; thus I'd like to avoid a std::map

Inside std::map, it store the key and value as a std::pair, so you do not need any book-keeping to get both key and value.

Further reading: more information about std::map

You might (ab)use of mutable:

struct Data {
    mutable int value;
    int DataId;
};

then you can return a const Data* and modify its value. Caveat is that you no longer have direct possibility to have "const" value.

One way to do this is to make your data value mutable like this:

struct Data {  
    mutable int value;  
    int DataId;  
};

Then return a const* as normal:

const Data* GetRWDataFromID(std::set<Data,Comparator> &collection,int id)
{
    auto it = collection.find(id);
    if (it == std::end(collection))
    {
        return nullptr;
    }
    // UB as soon as I will modify Data!!!
    return &(*it);
}

Now you can still modify the mutable element.

Alternatively you can use a different container and manage the order yourself.

Something a bit like this:

// some utility functions to insert and remove sorted
// unique values from a vector

auto insert_unique(std::vector<Data>& v, Data&& d)
    -> std::pair<std::vector<Data>::iterator, bool>
{
    auto lb = std::lower_bound(std::begin(v), std::end(v), d, Comparator());

    if(lb != std::end(v) && lb->DataId == d.DataId)
        return {lb, false};

    return {v.insert(lb, std::forward<Data>(d)), true};
}

auto insert_unique(std::vector<Data>& v, Data const& d)
    { auto copy = d; return insert_unique(v, std::move(copy)); }

auto remove_unique(std::vector<Data>& v, Data const& d)
    -> std::pair<std::vector<Data>::iterator, bool>
{
    auto lb = std::lower_bound(std::begin(v), std::end(v), d, Comparator());

    if(lb != std::end(v) && lb->DataId == d.DataId)
        return {v.erase(lb), true};

    return {std::end(v), false};
}

int main()
{
    std::vector<Data> v;

    Data d{1, 2};

    insert_unique(v, d);

    // do a binary search
    auto lb = std::lower_bound(std::begin(v), std::end(v), d, Comparator());

    if(lb != std::end(v)) // was it found?
        lb->value = 9;
}

Of course that extra code can be wrapped up in a "uniquely sorted vector" class.

This answer does not give a direct answer but provides some hints towards a better performance for querying if it is a bottleneck for you. It also supposes that the collection is immutable, ie. you can look for items in the collection after every items have been put into.

Actually, std::set::find is logarithmic in the size of the container, but you can almost reach a lookup in constant time if the collection can't grow. The idea is to store all the Data objects in a std::vector (which should be optimal in terms of space) and to store the keys in a succinct data structure such as Elias Fano that provides rank/select operation in nearly constant time (and with a little extra memory usage explained in the link above).

Then, from one given key k, you can find what is the position of the corresponding entry (if any) in the vector, which can be done with rank/select operations.

Here is a sketch of a IndexedVector class that uses the succinct library:

template<typename T, auto getkey>
class IndexedVector
{
public:

    using key_type = decltype(getkey(T{}));

    template<typename Range>
    IndexedVector (Range range, key_type maxKey=std::numeric_limits<key_type>::max()) 
    {
        succinct::elias_fano::elias_fano_builder bvb (maxKey, range.size());

        // We build both the vector of items and the Elian Fano structure holding the keys
        for (auto const& item : range)
        {
            bvb.push_back (getkey(item)); 
            items_.push_back (item);
        }

        succinct::elias_fano tmp(&bvb);
        bitmap_.swap (tmp);
    }

    std::optional<T> find (key_type k)
    {
        if (bitmap_[k]) {  return items_[bitmap_.rank(k+1)-1]; }
        else            {  return {};                          }
    }

private:
    std::vector<T>       items_;   // holds the items
    succinct::elias_fano bitmap_;  // holds the bitset containing the keys
};

Demo

With this kind of approach, one can observe a speedup (compared to std::set::find) ranging from 3x to 10x in function of the number of items in the collection.

与本文相关的文章

发布评论

评论列表(0)

  1. 暂无评论