c++ - Is there a generalization of std::bitset for two-bit values? -
Suppose I am a genome scientist trying to store extremely long strings of characters, each of which represents two bits of information (i.e. each element is either G, A, T, or C). Because the strings are incredibly long, I need to be able to store a string of length n in exactly 2n bits (or rather, n/4 bytes).
With that motivation in mind, I am looking for a generalization of std::bitset (or boost::dynamic_bitset<>) that works on two-bit values instead of single-bit values. I want to store n such two-bit values, each of which can be 0, 1, 2, or 3. I need the information packed as closely as possible in memory, so vector<char> would not work (as it wastes a factor of 4 of memory).
What is the best way to accomplish my goal? One alternative is to wrap the existing bitset templates with a customized operator[], iterators, etc., but I'd prefer to use an existing library if at all possible.
Given:

    enum class nucleobase { a, c, g, t };

you have two choices. You can:

1. use a single std::bitset and play with the indexing
2. use std::bitset in combination with another container

For the first, you can define a couple of functions that target the right number of bits per set/get:

    #include <bitset>
    #include <cstddef>

    // Element i occupies bits 2*i and 2*i + 1.
    template<std::size_t n>
    void set(std::bitset<n>& bits, std::size_t i, nucleobase x) {
        switch (x) {
            case nucleobase::a: bits.set(i * 2, 0); bits.set(i * 2 + 1, 0); break;
            case nucleobase::c: bits.set(i * 2, 0); bits.set(i * 2 + 1, 1); break;
            case nucleobase::g: bits.set(i * 2, 1); bits.set(i * 2 + 1, 0); break;
            case nucleobase::t: bits.set(i * 2, 1); bits.set(i * 2 + 1, 1); break;
        }
    }

    template<std::size_t n>
    nucleobase get(const std::bitset<n>& bits, std::size_t i) {
        if (!bits[i * 2]) {
            if (!bits[i * 2 + 1]) return nucleobase::a;
            else                  return nucleobase::c;
        } else {
            if (!bits[i * 2 + 1]) return nucleobase::g;
            else                  return nucleobase::t;
        }
    }

The above is just an example, and probably a terrible one (it's 4am here and I need sleep).
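As a rough sketch of how the first choice could be condensed, the two bits can be derived from the enum's numeric value instead of a switch. The names set_base, get_base and demo are hypothetical, and the code assumes the enumerators keep the values a=0, c=1, g=2, t=3 that their declaration order gives them. The bit layout is the same as in the switch-based version: the high bit goes to index 2*i and the low bit to 2*i + 1.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

enum class nucleobase { a, c, g, t };  // assumed values: a=0, c=1, g=2, t=3

// Hypothetical helper: store x in slot i (bits 2*i and 2*i + 1).
template <std::size_t N>
void set_base(std::bitset<N>& bits, std::size_t i, nucleobase x) {
    const unsigned v = static_cast<unsigned>(x);
    bits.set(i * 2,     (v & 2u) != 0);  // high bit of the pair
    bits.set(i * 2 + 1, (v & 1u) != 0);  // low bit of the pair
}

// Hypothetical helper: read slot i back as a nucleobase.
template <std::size_t N>
nucleobase get_base(const std::bitset<N>& bits, std::size_t i) {
    const unsigned v = (bits[i * 2] ? 2u : 0u) | (bits[i * 2 + 1] ? 1u : 0u);
    return static_cast<nucleobase>(v);
}

// Small usage example: a bitset of 8 bits holds four bases.
inline void demo() {
    std::bitset<8> bits;
    set_base(bits, 0, nucleobase::g);
    set_base(bits, 1, nucleobase::t);
    assert(get_base(bits, 0) == nucleobase::g);
    assert(get_base(bits, 1) == nucleobase::t);
    assert(get_base(bits, 2) == nucleobase::a);  // untouched slots read as a
}
```

A std::bitset<2 * n> then holds n bases, but n has to be known at compile time.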
For the second, you just need to map alleles and bits:

    #include <bitset>

    // bit_pair is not defined in this post; a two-bit std::bitset is assumed.
    using bit_pair = std::bitset<2>;

    // In the string constructor the leftmost character is the most
    // significant bit, so "01" is 1 and "10" is 2.
    bit_pair bits_for(nucleobase x) {
        switch (x) {
            case nucleobase::a: return bit_pair("00");
            case nucleobase::c: return bit_pair("01");
            case nucleobase::g: return bit_pair("10");
            case nucleobase::t: return bit_pair("11");
        }
        return bit_pair("00"); // unreachable; silences the missing-return warning
    }

    nucleobase nucleobase_for(bit_pair x) {
        switch (x.to_ulong()) {
            case 0: return nucleobase::a;
            case 1: return nucleobase::c;
            case 2: return nucleobase::g;
            case 3: return nucleobase::t;
            default: return nucleobase::a; // unreachable; silences the warning
        }
    }

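To show how that mapping might be used with a container, here is a minimal sketch that stores a sequence as a std::vector of bit pairs. The bit_pair alias, the encode and decode helpers, and the numeric mapping a=0, c=1, g=2, t=3 are assumptions made for illustration. Note that a std::bitset<2> typically occupies at least a whole machine word, so this variant shows the mapping more than it shows tight packing.

```cpp
#include <bitset>
#include <cassert>
#include <string>
#include <vector>

enum class nucleobase { a, c, g, t };  // assumed values: a=0, c=1, g=2, t=3
using bit_pair = std::bitset<2>;       // assumed alias

bit_pair bits_for(nucleobase x) {
    return bit_pair(static_cast<unsigned long long>(x));
}

nucleobase nucleobase_for(bit_pair x) {
    return static_cast<nucleobase>(x.to_ulong());
}

// Hypothetical helper: text such as "gattaca" -> vector of bit pairs.
// Any character other than c, g or t is stored as a, purely for brevity.
std::vector<bit_pair> encode(const std::string& text) {
    std::vector<bit_pair> out;
    out.reserve(text.size());
    for (char ch : text) {
        switch (ch) {
            case 'c': out.push_back(bits_for(nucleobase::c)); break;
            case 'g': out.push_back(bits_for(nucleobase::g)); break;
            case 't': out.push_back(bits_for(nucleobase::t)); break;
            default:  out.push_back(bits_for(nucleobase::a)); break;
        }
    }
    return out;
}

// Hypothetical helper: vector of bit pairs -> text.
std::string decode(const std::vector<bit_pair>& seq) {
    static const char letters[] = {'a', 'c', 'g', 't'};
    std::string text;
    text.reserve(seq.size());
    for (bit_pair p : seq) {
        text.push_back(letters[p.to_ulong()]);
    }
    return text;
}

// Small usage example: encoding and decoding should round-trip.
inline void check_round_trip() {
    assert(decode(encode("gattaca")) == "gattaca");
}
```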
Of course, if you need a runtime length you can use boost::dynamic_bitset and std::vector.
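For the runtime-length case, one possible standard-library-only sketch packs four two-bit values into each byte of a std::vector<std::uint8_t>. This is a substitute written for illustration, not boost::dynamic_bitset itself; the class name packed_bases and its members are hypothetical, and the enum is assumed to have the values a=0, c=1, g=2, t=3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class nucleobase { a, c, g, t };  // assumed values: a=0, c=1, g=2, t=3

// Hypothetical runtime-sized container: four 2-bit values per byte.
class packed_bases {
public:
    // Allocates ceil(n / 4) zeroed bytes, so every slot starts out as a.
    explicit packed_bases(std::size_t n)
        : size_(n), bytes_((n + 3) / 4, 0) {}

    std::size_t size() const { return size_; }
    std::size_t byte_count() const { return bytes_.size(); }

    void set(std::size_t i, nucleobase x) {
        assert(i < size_);
        const unsigned shift = static_cast<unsigned>(i % 4) * 2;
        std::uint8_t& byte = bytes_[i / 4];
        // Clear the two bits of slot i, then write the new value into them.
        byte = static_cast<std::uint8_t>(
            (byte & ~(0x3u << shift)) | (static_cast<unsigned>(x) << shift));
    }

    nucleobase get(std::size_t i) const {
        assert(i < size_);
        const unsigned shift = static_cast<unsigned>(i % 4) * 2;
        return static_cast<nucleobase>((bytes_[i / 4] >> shift) & 0x3u);
    }

private:
    std::size_t size_;
    std::vector<std::uint8_t> bytes_;
};
```

With this layout, packed_bases seq(10) uses three bytes of storage, and seq.set(5, nucleobase::g) followed by seq.get(5) yields nucleobase::g.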