Tpetra parallel linear algebra  Version of the Day
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Macros Pages
Tpetra_Details_FixedHashTable_def.hpp
1 /*
2 // @HEADER
3 // ***********************************************************************
4 //
5 // Tpetra: Templated Linear Algebra Services Package
6 // Copyright (2008) Sandia Corporation
7 //
8 // Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
9 // the U.S. Government retains certain rights in this software.
10 //
11 // Redistribution and use in source and binary forms, with or without
12 // modification, are permitted provided that the following conditions are
13 // met:
14 //
15 // 1. Redistributions of source code must retain the above copyright
16 // notice, this list of conditions and the following disclaimer.
17 //
18 // 2. Redistributions in binary form must reproduce the above copyright
19 // notice, this list of conditions and the following disclaimer in the
20 // documentation and/or other materials provided with the distribution.
21 //
22 // 3. Neither the name of the Corporation nor the names of the
23 // contributors may be used to endorse or promote products derived from
24 // this software without specific prior written permission.
25 //
26 // THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
27 // EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
28 // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
29 // PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
30 // CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
31 // EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
32 // PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
33 // PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
34 // LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
35 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
36 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
37 //
38 // Questions? Contact Michael A. Heroux (maherou@sandia.gov)
39 //
40 // ************************************************************************
41 // @HEADER
42 */
43 
44 #ifndef TPETRA_DETAILS_FIXEDHASHTABLE_DEF_HPP
45 #define TPETRA_DETAILS_FIXEDHASHTABLE_DEF_HPP
46 
47 #include "Tpetra_Details_FixedHashTable_decl.hpp"
49 #ifdef TPETRA_USE_MURMUR_HASH
50 # include "Kokkos_Functional.hpp" // hash function used by Kokkos::UnorderedMap
51 #endif // TPETRA_USE_MURMUR_HASH
52 #include "Kokkos_ArithTraits.hpp"
53 #include "Teuchos_TypeNameTraits.hpp"
55 #include <type_traits>
56 
57 namespace Tpetra {
58 namespace Details {
59 //
60 // This namespace stores utility functions and Kokkos
61 // functors for use in FixedHashTable construction.
62 //
63 namespace FHT {
64 
65 
66 // Is it worth actually using building the FixedHashTable using
67 // parallel threads, instead of just counting in a sequential loop?
68 //
69 // The parallel version of FixedHashTable construction isn't just a
70 // parallelization of the sequential loops. It incurs additional
71 // overheads. For example, the CountBuckets kernel uses atomic update
72 // instructions to count the number of "buckets" per offsets array
73 // (ptr) entry. Atomic updates have overhead, even if only one thread
74 // issues them. The Kokkos kernels are still correct in that case,
75 // but I would rather not incur overhead then. It might make sense to
76 // set the minimum number of threads to something greater than 1, but
77 // we would need experiments to find out.
78 template<class ExecSpace>
79 bool worthBuildingFixedHashTableInParallel () {
80 // KDD 8/19: A temporary fix for #5179. It appears some errors come from
81 // KDD 8/19: computeOffsetsFromCounts; this temporary patch avoids the most
82 // KDD 8/19: reproducible of the errors. We'll reverse this as we debug
83 // KDD 8/19: the problem, but this temporary fix may allow users to make
84 // KDD 8/19: progress toward their milestones.
85 //std::cout << "KDDKDD WorthIt " << (ExecSpace::concurrency() > 1) << std::endl;
86  return ExecSpace::concurrency() > 1;
87 // return false;
88 // KDD 8/19
89 }
90 
91 // If the input kokkos::View<const KeyType*, ArrayLayout,
92 // InputExecSpace, Kokkos::MemoryUnamanged> is NOT accessible from the
93 // OutputExecSpace execution space, make and return a deep copy.
94 // Otherwise, just return the original input.
95 //
96 // The point of this is to avoid unnecessary copies, when the input
97 // array of keys comes in as a Teuchos::ArrayView (which we wrap in an
98 // unmanaged Kokkos::View).
99 template<class KeyType,
100  class ArrayLayout,
101  class InputExecSpace,
102  class OutputExecSpace,
103  const bool mustDeepCopy =
104  ! std::is_same<typename InputExecSpace::memory_space,
105  typename OutputExecSpace::memory_space>::value>
106 struct DeepCopyIfNeeded {
107  // The default implementation is trivial; all the work happens in
108  // partial specializations.
109 };
110 
111 // Specialization for when a deep copy is actually needed.
112 template<class KeyType,
113  class ArrayLayout,
114  class InputExecSpace,
115  class OutputExecSpace>
116 struct DeepCopyIfNeeded<KeyType, ArrayLayout, InputExecSpace, OutputExecSpace, true>
117 {
118  typedef Kokkos::View<const KeyType*, ArrayLayout,
119  InputExecSpace, Kokkos::MemoryUnmanaged> input_view_type;
120  // In this case, a deep copy IS needed. As a result, the output
121  // type is a managed Kokkos::View, which differs from the input
122  // type. Clients must get the correct return type from this struct,
123  // either from the typedef below or from 'auto'. Assigning an
124  // unmanaged View to a managed View is a syntax error.
125  typedef Kokkos::View<const KeyType*, ArrayLayout, OutputExecSpace> output_view_type;
126 
127  static output_view_type copy (const input_view_type& src) {
128  typedef typename output_view_type::non_const_type NC;
129 
130  NC dst (Kokkos::ViewAllocateWithoutInitializing (src.tracker ().label ()),
131  src.extent (0));
132  Kokkos::deep_copy (dst, src);
133  return output_view_type (dst);
134  }
135 };
136 
137 // Specialization if no need to make a deep copy.
138 template<class KeyType,
139  class ArrayLayout,
140  class InputExecSpace,
141  class OutputExecSpace>
142 struct DeepCopyIfNeeded<KeyType, ArrayLayout, InputExecSpace, OutputExecSpace, false> {
143  typedef Kokkos::View<const KeyType*, ArrayLayout,
144  InputExecSpace, Kokkos::MemoryUnmanaged> input_view_type;
145  typedef Kokkos::View<const KeyType*, ArrayLayout, OutputExecSpace,
146  Kokkos::MemoryUnmanaged> output_view_type;
147 
148  static output_view_type copy (const input_view_type& src) {
149  return output_view_type (src);
150  }
151 };
152 
153 //
154 // Functors for FixedHashTable initialization
155 //
156 // NOTE (mfh 23 May 2015): Once we can use lambdas with CUDA, we
157 // should consider replacing all of these functors with in-line
158 // lambdas. The only issue is that we would need to bake the
159 // execution space into the policy, since the default execution space
160 // might differ from the one Tpetra wants to use.
161 
171 template<class CountsViewType,
172  class KeysViewType,
173  class SizeType = typename KeysViewType::size_type>
175 public:
176  typedef CountsViewType counts_view_type;
177  typedef KeysViewType keys_view_type;
178  typedef typename CountsViewType::execution_space execution_space;
179  typedef typename CountsViewType::memory_space memory_space;
180  typedef SizeType size_type;
181  typedef typename keys_view_type::non_const_value_type key_type;
182  // mfh 21 May 2015: Having a device_type typedef in the functor
183  // along with an execution_space typedef causes compilation issues.
184  // This is because one of Kokkos' partial specializations picks up
185  // on the device_type typedef, and another picks up on the
186  // execution_space typedef. The former is a legacy of a previous
187  // design iteration of Kokkos, which did not separate memory and
188  // execution spaces.
190 
196  CountBuckets (const counts_view_type& counts,
197  const keys_view_type& keys,
198  const size_type size) :
199  counts_ (counts),
200  keys_ (keys),
201  size_ (size)
202  {}
203 
208  KOKKOS_INLINE_FUNCTION void
209  operator () (const size_type& i) const
210  {
211  typedef typename hash_type::result_type hash_value_type;
212 
213  const hash_value_type hashVal = hash_type::hashFunc (keys_[i], size_);
214  Kokkos::atomic_increment (&counts_[hashVal]);
215  }
216 
217  using value_type = Kokkos::pair<int, key_type>;
218 
222  KOKKOS_INLINE_FUNCTION void
223  operator () (const size_type& i, value_type& dst) const
224  {
225  using hash_value_type = typename hash_type::result_type;
226 
227  const key_type keyVal = keys_[i];
228  const hash_value_type hashVal = hash_type::hashFunc (keyVal, size_);
229  if (hashVal < hash_value_type (0) ||
230  hashVal >= hash_value_type (counts_.extent (0))) {
231  dst.first = 1;
232  dst.second = keyVal;
233  }
234  else {
235  Kokkos::atomic_increment (&counts_[hashVal]);
236  }
237  }
238 
240  KOKKOS_INLINE_FUNCTION void init (value_type& dst) const
241  {
242  dst.first = 0;
243  dst.second = key_type (0);
244  }
245 
246  KOKKOS_INLINE_FUNCTION void
247  join (volatile value_type& dst,
248  const volatile value_type& src) const
249  {
250  const bool good = dst.first == 0 || src.first == 0;
251  dst.first = good ? 0 : dst.first;
252  // leave dst.second as it is, to get the "first" key
253  }
254 
255 private:
257  counts_view_type counts_;
259  keys_view_type keys_;
261  size_type size_;
262 };
263 
274 template<class KeyType>
276  KOKKOS_INLINE_FUNCTION
277  FillPairsResult () :
278  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
279  // min() for a floating-point type returns the minimum _positive_
280  // normalized value. This is different than for integer types.
281  // lowest() is new in C++11 and returns the least value, always
282  // negative for signed finite types.
283  //
284  // mfh 23 May 2015: I have heard reports that
285  // std::numeric_limits<int>::lowest() does not exist with the
286  // Intel compiler. I'm not sure if the users in question actually
287  // enabled C++11. However, it's easy enough to work around this
288  // issue. The standard floating-point types are signed and have a
289  // sign bit, so lowest() is just -max(). For integer types, we
290  // can use min() instead.
291  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
292  ::Kokkos::Details::ArithTraits<KeyType>::min () :
293  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
294  success_ (true)
295  {
296  static_assert (std::is_arithmetic<KeyType>::value, "FillPairsResult: "
297  "KeyType must be some kind of number type.");
298  }
299 
300  KOKKOS_INLINE_FUNCTION
301  FillPairsResult (const KeyType& initMinKey,
302  const KeyType& initMaxKey) :
303  minKey_ (initMinKey),
304  maxKey_ (initMaxKey),
305  success_ (true)
306  {
307  static_assert (std::is_arithmetic<KeyType>::value, "FillPairsResult: "
308  "KeyType must be some kind of number type.");
309  }
310 
311  KeyType minKey_;
312  KeyType maxKey_;
313  bool success_;
314 };
315 
345 template<class PairsViewType,
346  class KeysViewType,
347  class CountsViewType,
348  class SizeType = typename KeysViewType::size_type>
349 class FillPairs {
350 public:
351  typedef typename CountsViewType::non_const_type counts_view_type;
352  typedef typename counts_view_type::const_type offsets_view_type;
353 
354  typedef PairsViewType pairs_view_type;
355  typedef typename KeysViewType::const_type keys_view_type;
356  typedef typename offsets_view_type::execution_space execution_space;
357  typedef typename offsets_view_type::memory_space memory_space;
358  typedef SizeType size_type;
359 
360  typedef typename keys_view_type::non_const_value_type key_type;
361  typedef typename pairs_view_type::non_const_value_type pair_type;
362 
364 
365  // mfh 23 May 2015: Having a device_type typedef in the functor
366  // along with an execution_space typedef causes compilation issues.
367  // This is because one of Kokkos' partial specializations picks up
368  // on the device_type typedef, and another picks up on the
369  // execution_space typedef. The former is a legacy of a previous
370  // design iteration of Kokkos, which did not separate memory and
371  // execution spaces.
373 
384  FillPairs (const pairs_view_type& pairs,
385  const counts_view_type& counts,
386  const offsets_view_type& ptr,
387  const keys_view_type& keys,
388  const typename pair_type::second_type startingValue) :
389  pairs_ (pairs),
390  counts_ (counts),
391  ptr_ (ptr),
392  keys_ (keys),
393  size_ (counts.extent (0)),
394  startingValue_ (startingValue),
395  initMinKey_ (::Kokkos::Details::ArithTraits<key_type>::max ()),
396  initMaxKey_ (::Kokkos::Details::ArithTraits<key_type>::is_integer ?
397  ::Kokkos::Details::ArithTraits<key_type>::min () :
398  -::Kokkos::Details::ArithTraits<key_type>::max ())
399  {}
400 
419  FillPairs (const pairs_view_type& pairs,
420  const counts_view_type& counts,
421  const offsets_view_type& ptr,
422  const keys_view_type& keys,
423  const typename pair_type::second_type startingValue,
424  const key_type initMinKey,
425  const key_type initMaxKey) :
426  pairs_ (pairs),
427  counts_ (counts),
428  ptr_ (ptr),
429  keys_ (keys),
430  size_ (counts.extent (0)),
431  startingValue_ (startingValue),
432  initMinKey_ (initMinKey),
433  initMaxKey_ (initMaxKey)
434  {}
435 
437  KOKKOS_INLINE_FUNCTION void init (value_type& dst) const
438  {
439  dst.minKey_ = initMinKey_;
440  dst.maxKey_ = initMaxKey_;
441  dst.success_ = true;
442  }
443 
444  KOKKOS_INLINE_FUNCTION void
445  join (volatile value_type& dst,
446  const volatile value_type& src) const
447  {
448  if (src.maxKey_ > dst.maxKey_) {
449  dst.maxKey_ = src.maxKey_;
450  }
451  if (src.minKey_ < dst.minKey_) {
452  dst.minKey_ = src.minKey_;
453  }
454  dst.success_ = dst.success_ && src.success_;
455  }
456 
461  KOKKOS_INLINE_FUNCTION void
462  operator () (const size_type& i, value_type& dst) const
463  {
464  typedef typename hash_type::result_type hash_value_type;
465  typedef typename offsets_view_type::non_const_value_type offset_type;
466  typedef typename pair_type::second_type val_type;
467  typedef typename std::remove_reference< decltype( counts_[0] ) >::type atomic_incr_type;
468 
469  const key_type key = keys_[i];
470  if (key > dst.maxKey_) {
471  dst.maxKey_ = key;
472  }
473  if (key < dst.minKey_) {
474  dst.minKey_ = key;
475  }
476  const val_type theVal = startingValue_ + static_cast<val_type> (i);
477  const hash_value_type hashVal = hash_type::hashFunc (key, size_);
478 
479  // Return the old count; decrement afterwards.
480  const offset_type count = Kokkos::atomic_fetch_add (&counts_[hashVal], atomic_incr_type(-1));
481  if (count == 0) {
482  dst.success_ = false; // FAILURE!
483  }
484  else {
485  const offset_type curPos = ptr_[hashVal+1] - count;
486 
487  pairs_[curPos].first = key;
488  pairs_[curPos].second = theVal;
489  }
490  }
491 
492 private:
493  pairs_view_type pairs_;
494  counts_view_type counts_;
495  offsets_view_type ptr_;
496  keys_view_type keys_;
497  size_type size_;
498  typename pair_type::second_type startingValue_;
500  key_type initMinKey_;
502  key_type initMaxKey_;
503 };
504 
528 template<class OffsetsViewType,
529  class PairsViewType,
530  class SizeType = typename OffsetsViewType::size_type>
532 public:
533  typedef typename OffsetsViewType::const_type offsets_view_type;
534  typedef typename PairsViewType::const_type pairs_view_type;
535  typedef typename offsets_view_type::execution_space execution_space;
536  typedef typename offsets_view_type::memory_space memory_space;
537  typedef SizeType size_type;
538 
539  // The result of the check is whether the table has one or more duplicates.
540  typedef int value_type;
541 
546  CheckForDuplicateKeys (const pairs_view_type& pairs,
547  const offsets_view_type& ptr) :
548  pairs_ (pairs),
549  ptr_ (ptr),
550  size_ (ptr_.extent (0) == 0 ?
551  size_type (0) :
552  ptr_.extent (0) - 1)
553  {}
554 
556  KOKKOS_INLINE_FUNCTION void init (value_type& dst) const
557  {
558  dst = 0;
559  }
560 
562  KOKKOS_INLINE_FUNCTION void
563  join (volatile value_type& dst,
564  const volatile value_type& src) const
565  {
566  dst = dst + src > 0?1:0;
567  }
568 
570  KOKKOS_INLINE_FUNCTION void
571  operator () (const size_type& i, value_type& dst) const
572  {
573  typedef typename offsets_view_type::non_const_value_type offset_type;
574  typedef typename pairs_view_type::non_const_value_type pair_type;
575  typedef typename pair_type::first_type key_type;
576 
577  if (dst>0) {
578  return; // we've already found duplicate keys elsewhere
579  }
580  else {
581  const offset_type beg = ptr_[i];
582  const offset_type end = ptr_[i+1];
583  bool foundDuplicateKey = false;
584  // This is an ~ n^2 algorithm in the worst case, where n is the
585  // max number of keys that hash to the same bucket. However, if
586  // the hash function is reasonable, n should be much less than
587  // the total number of keys.
588  for (offset_type j = beg + 1; j < end; ++j) {
589  const key_type curKey = pairs_[j].first;
590  for (offset_type k = beg; k < j; ++k) {
591  if (pairs_[k].first == curKey) {
592  foundDuplicateKey = true;
593  break;
594  }
595  }
596  }
597  dst = (dst>0) || foundDuplicateKey?1:0;
598  }
599  }
600 
601 private:
602  pairs_view_type pairs_;
603  offsets_view_type ptr_;
604  size_type size_;
605 };
606 
607 } // namespace FHT
608 
609 //
610 // Here begins the actual implementation of FixedHashTable.
611 //
612 
613 template<class KeyType, class ValueType, class DeviceType>
614 bool
616 hasKeys () const {
617  // This also works if FixedHashTable has no entries. getKey()
618  // works in that case, but always returns the flag (invalid).
619  //
620  // FIXME (31 May 2015) This only works because vals_ contains no
621  // padding. If we ever pad within a "row" of vals_, we'll have to
622  // change this.
623  return keys_.extent (0) == val_.extent (0);
624 }
625 
626 template<class KeyType, class ValueType, class DeviceType>
627 void
629 check () const
630 {
631  // const char prefix[] = "Tpetra::Details::FixedHashTable: ";
632  // const char suffix[] = " Please report this bug to the Tpetra developers.";
633 }
634 
635 template<class KeyType, class ValueType, class DeviceType>
638  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
639  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
640  ::Kokkos::Details::ArithTraits<KeyType>::min () :
641  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
642  minVal_ (::Kokkos::Details::ArithTraits<ValueType>::max ()),
643  maxVal_ (::Kokkos::Details::ArithTraits<ValueType>::is_integer ?
644  ::Kokkos::Details::ArithTraits<ValueType>::min () :
645  -::Kokkos::Details::ArithTraits<ValueType>::max ()),
646  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
647  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
648  ::Kokkos::Details::ArithTraits<KeyType>::min () :
649  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
650  contiguousValues_ (true), // trivially
651  checkedForDuplicateKeys_ (true), // it's an empty table; no need to check
652  hasDuplicateKeys_ (false)
653 {
654 #ifdef HAVE_TPETRA_DEBUG
655  check ();
656 #endif // HAVE_TPETRA_DEBUG
657 }
658 
659 template<class KeyType, class ValueType, class DeviceType>
661 FixedHashTable (const keys_type& keys) :
662  keys_ (keys),
663  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()), // to be set in init()
664  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
665  ::Kokkos::Details::ArithTraits<KeyType>::min () :
666  -::Kokkos::Details::ArithTraits<KeyType>::max ()), // to be set in init()
667  minVal_ (0),
668  maxVal_ (keys.size () == 0 ?
669  static_cast<ValueType> (0) :
670  static_cast<ValueType> (keys.size () - 1)),
671  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
672  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
673  ::Kokkos::Details::ArithTraits<KeyType>::min () :
674  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
675  contiguousValues_ (true),
676  checkedForDuplicateKeys_ (false),
677  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
678 {
679  const ValueType startingValue = static_cast<ValueType> (0);
680  const KeyType initMinKey = this->minKey_;
681  const KeyType initMaxKey = this->maxKey_;
682  this->init (keys, startingValue, initMinKey, initMaxKey,
683  initMinKey, initMinKey, false);
684 
685 #ifdef HAVE_TPETRA_DEBUG
686  check ();
687 #endif // HAVE_TPETRA_DEBUG
688 }
689 
690 template<class KeyType, class ValueType, class DeviceType>
692 FixedHashTable (const Teuchos::ArrayView<const KeyType>& keys,
693  const bool keepKeys) :
694  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()), // to be set in init()
695  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
696  ::Kokkos::Details::ArithTraits<KeyType>::min () :
697  -::Kokkos::Details::ArithTraits<KeyType>::max ()), // to be set in init()
698  minVal_ (0),
699  maxVal_ (keys.size () == 0 ?
700  static_cast<ValueType> (0) :
701  static_cast<ValueType> (keys.size () - 1)),
702  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
703  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
704  ::Kokkos::Details::ArithTraits<KeyType>::min () :
705  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
706  contiguousValues_ (true),
707  checkedForDuplicateKeys_ (false),
708  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
709 {
710  typedef typename keys_type::non_const_type nonconst_keys_type;
711 
712  // mfh 01 May 2015: I don't trust that
713  // Teuchos::ArrayView::getRawPtr() returns NULL when the size is 0,
714  // so I ensure this manually.
715  const ValueType startingValue = static_cast<ValueType> (0);
716  host_input_keys_type keys_k (keys.size () == 0 ? NULL : keys.getRawPtr (),
717  keys.size ());
718  using Kokkos::ViewAllocateWithoutInitializing;
719  nonconst_keys_type keys_d (ViewAllocateWithoutInitializing ("FixedHashTable::keys"),
720  keys_k.extent (0));
721  Kokkos::deep_copy (keys_d, keys_k);
722  const KeyType initMinKey = this->minKey_;
723  const KeyType initMaxKey = this->maxKey_;
724  this->init (keys_d, startingValue, initMinKey, initMaxKey,
725  initMinKey, initMinKey, false);
726  if (keepKeys) {
727  keys_ = keys_d;
728 #ifdef HAVE_TPETRA_DEBUG
729  typedef typename keys_type::size_type size_type;
730  TEUCHOS_TEST_FOR_EXCEPTION
731  (keys_.extent (0) != static_cast<size_type> (keys.size ()),
732  std::logic_error, "Tpetra::Details::FixedHashTable constructor: "
733  "keepKeys is true, but on return, keys_.extent(0) = " <<
734  keys_.extent (0) << " != keys.size() = " << keys.size () <<
735  ". Please report this bug to the Tpetra developers.");
736 #endif // HAVE_TPETRA_DEBUG
737  }
738 
739 #ifdef HAVE_TPETRA_DEBUG
740  check ();
741 #endif // HAVE_TPETRA_DEBUG
742 }
743 
744 template<class KeyType, class ValueType, class DeviceType>
746 FixedHashTable (const Teuchos::ArrayView<const KeyType>& keys,
747  const ValueType startingValue,
748  const bool keepKeys) :
749  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
750  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
751  ::Kokkos::Details::ArithTraits<KeyType>::min () :
752  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
753  minVal_ (startingValue),
754  maxVal_ (keys.size () == 0 ?
755  startingValue :
756  static_cast<ValueType> (startingValue + keys.size () - 1)),
757  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
758  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
759  ::Kokkos::Details::ArithTraits<KeyType>::min () :
760  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
761  contiguousValues_ (true),
762  checkedForDuplicateKeys_ (false),
763  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
764 {
765  typedef typename keys_type::non_const_type nonconst_keys_type;
766 
767  // mfh 01 May 2015: I don't trust that
768  // Teuchos::ArrayView::getRawPtr() returns NULL when the size is 0,
769  // so I ensure this manually.
770  host_input_keys_type keys_k (keys.size () == 0 ? NULL : keys.getRawPtr (),
771  keys.size ());
772  using Kokkos::ViewAllocateWithoutInitializing;
773  nonconst_keys_type keys_d (ViewAllocateWithoutInitializing ("FixedHashTable::keys"),
774  keys_k.extent (0));
775  Kokkos::deep_copy (keys_d, keys_k);
776 
777  const KeyType initMinKey = ::Kokkos::Details::ArithTraits<KeyType>::max ();
778  // min() for a floating-point type returns the minimum _positive_
779  // normalized value. This is different than for integer types.
780  // lowest() is new in C++11 and returns the least value, always
781  // negative for signed finite types.
782  //
783  // mfh 23 May 2015: I have heard reports that
784  // std::numeric_limits<int>::lowest() does not exist with the Intel
785  // compiler. I'm not sure if the users in question actually enabled
786  // C++11. However, it's easy enough to work around this issue. The
787  // standard floating-point types are signed and have a sign bit, so
788  // lowest() is just -max(). For integer types, we can use min()
789  // instead.
790  const KeyType initMaxKey = ::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
791  ::Kokkos::Details::ArithTraits<KeyType>::min () :
792  -::Kokkos::Details::ArithTraits<KeyType>::max ();
793  this->init (keys_d, startingValue, initMinKey, initMaxKey,
794  initMinKey, initMinKey, false);
795  if (keepKeys) {
796  keys_ = keys_d;
797 #ifdef HAVE_TPETRA_DEBUG
798  typedef typename keys_type::size_type size_type;
799  TEUCHOS_TEST_FOR_EXCEPTION
800  (keys_.extent (0) != static_cast<size_type> (keys.size ()),
801  std::logic_error, "Tpetra::Details::FixedHashTable constructor: "
802  "keepKeys is true, but on return, keys_.extent(0) = " <<
803  keys_.extent (0) << " != keys.size() = " << keys.size () <<
804  ". Please report this bug to the Tpetra developers.");
805 #endif // HAVE_TPETRA_DEBUG
806  }
807 
808 #ifdef HAVE_TPETRA_DEBUG
809  check ();
810 #endif // HAVE_TPETRA_DEBUG
811 }
812 
813 template<class KeyType, class ValueType, class DeviceType>
816  const KeyType firstContigKey,
817  const KeyType lastContigKey,
818  const ValueType startingValue,
819  const bool keepKeys) :
820  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
821  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
822  ::Kokkos::Details::ArithTraits<KeyType>::min () :
823  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
824  minVal_ (startingValue),
825  maxVal_ (keys.size () == 0 ?
826  startingValue :
827  static_cast<ValueType> (startingValue + keys.size () - 1)),
828  firstContigKey_ (firstContigKey),
829  lastContigKey_ (lastContigKey),
830  contiguousValues_ (true),
831  checkedForDuplicateKeys_ (false),
832  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
833 {
834  const KeyType initMinKey = ::Kokkos::Details::ArithTraits<KeyType>::max ();
835  // min() for a floating-point type returns the minimum _positive_
836  // normalized value. This is different than for integer types.
837  // lowest() is new in C++11 and returns the least value, always
838  // negative for signed finite types.
839  //
840  // mfh 23 May 2015: I have heard reports that
841  // std::numeric_limits<int>::lowest() does not exist with the Intel
842  // compiler. I'm not sure if the users in question actually enabled
843  // C++11. However, it's easy enough to work around this issue. The
844  // standard floating-point types are signed and have a sign bit, so
845  // lowest() is just -max(). For integer types, we can use min()
846  // instead.
847  const KeyType initMaxKey = ::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
848  ::Kokkos::Details::ArithTraits<KeyType>::min () :
849  -::Kokkos::Details::ArithTraits<KeyType>::max ();
850  this->init (keys, startingValue, initMinKey, initMaxKey,
851  firstContigKey, lastContigKey, true);
852  if (keepKeys) {
853  keys_ = keys;
854 #ifdef HAVE_TPETRA_DEBUG
855  typedef typename keys_type::size_type size_type;
856  TEUCHOS_TEST_FOR_EXCEPTION
857  (keys_.extent (0) != static_cast<size_type> (keys.size ()),
858  std::logic_error, "Tpetra::Details::FixedHashTable constructor: "
859  "keepKeys is true, but on return, keys_.extent(0) = " <<
860  keys_.extent (0) << " != keys.size() = " << keys.size () <<
861  ". Please report this bug to the Tpetra developers.");
862 #endif // HAVE_TPETRA_DEBUG
863  }
864 
865 #ifdef HAVE_TPETRA_DEBUG
866  check ();
867 #endif // HAVE_TPETRA_DEBUG
868 }
869 
870 template<class KeyType, class ValueType, class DeviceType>
872 FixedHashTable (const Teuchos::ArrayView<const KeyType>& keys,
873  const KeyType firstContigKey,
874  const KeyType lastContigKey,
875  const ValueType startingValue,
876  const bool keepKeys) :
877  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
878  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
879  ::Kokkos::Details::ArithTraits<KeyType>::min () :
880  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
881  minVal_ (startingValue),
882  maxVal_ (keys.size () == 0 ?
883  startingValue :
884  static_cast<ValueType> (startingValue + keys.size () - 1)),
885  firstContigKey_ (firstContigKey),
886  lastContigKey_ (lastContigKey),
887  contiguousValues_ (true),
888  checkedForDuplicateKeys_ (false),
889  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
890 {
891  typedef typename keys_type::non_const_type nonconst_keys_type;
892 
893  // mfh 01 May 2015: I don't trust that
894  // Teuchos::ArrayView::getRawPtr() returns NULL when the size is 0,
895  // so I ensure this manually.
896  host_input_keys_type keys_k (keys.size () == 0 ? NULL : keys.getRawPtr (),
897  keys.size ());
898  using Kokkos::ViewAllocateWithoutInitializing;
899  nonconst_keys_type keys_d (ViewAllocateWithoutInitializing ("FixedHashTable::keys"),
900  keys_k.extent (0));
901  Kokkos::deep_copy (keys_d, keys_k);
902 
903  const KeyType initMinKey = ::Kokkos::Details::ArithTraits<KeyType>::max ();
904  // min() for a floating-point type returns the minimum _positive_
905  // normalized value. This is different than for integer types.
906  // lowest() is new in C++11 and returns the least value, always
907  // negative for signed finite types.
908  //
909  // mfh 23 May 2015: I have heard reports that
910  // std::numeric_limits<int>::lowest() does not exist with the Intel
911  // compiler. I'm not sure if the users in question actually enabled
912  // C++11. However, it's easy enough to work around this issue. The
913  // standard floating-point types are signed and have a sign bit, so
914  // lowest() is just -max(). For integer types, we can use min()
915  // instead.
916  const KeyType initMaxKey = ::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
917  ::Kokkos::Details::ArithTraits<KeyType>::min () :
918  -::Kokkos::Details::ArithTraits<KeyType>::max ();
919  this->init (keys_d, startingValue, initMinKey, initMaxKey,
920  firstContigKey, lastContigKey, true);
921  if (keepKeys) {
922  keys_ = keys_d;
923 #ifdef HAVE_TPETRA_DEBUG
924  typedef typename keys_type::size_type size_type;
925  TEUCHOS_TEST_FOR_EXCEPTION
926  (keys_.extent (0) != static_cast<size_type> (keys.size ()),
927  std::logic_error, "Tpetra::Details::FixedHashTable constructor: "
928  "keepKeys is true, but on return, keys_.extent(0) = " <<
929  keys_.extent (0) << " != keys.size() = " << keys.size () <<
930  ". Please report this bug to the Tpetra developers.");
931 #endif // HAVE_TPETRA_DEBUG
932  }
933 
934 #ifdef HAVE_TPETRA_DEBUG
935  check ();
936 #endif // HAVE_TPETRA_DEBUG
937 }
938 
939 template<class KeyType, class ValueType, class DeviceType>
942  const ValueType startingValue) :
943  keys_ (keys),
944  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
945  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
946  ::Kokkos::Details::ArithTraits<KeyType>::min () :
947  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
948  minVal_ (startingValue),
949  maxVal_ (keys.size () == 0 ?
950  startingValue :
951  static_cast<ValueType> (startingValue + keys.size () - 1)),
952  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
953  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
954  ::Kokkos::Details::ArithTraits<KeyType>::min () :
955  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
956  contiguousValues_ (true),
957  checkedForDuplicateKeys_ (false),
958  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
959 {
960  const KeyType initMinKey = ::Kokkos::Details::ArithTraits<KeyType>::max ();
961  // min() for a floating-point type returns the minimum _positive_
962  // normalized value. This is different than for integer types.
963  // lowest() is new in C++11 and returns the least value, always
964  // negative for signed finite types.
965  //
966  // mfh 23 May 2015: I have heard reports that
967  // std::numeric_limits<int>::lowest() does not exist with the Intel
968  // compiler. I'm not sure if the users in question actually enabled
969  // C++11. However, it's easy enough to work around this issue. The
970  // standard floating-point types are signed and have a sign bit, so
971  // lowest() is just -max(). For integer types, we can use min()
972  // instead.
973  const KeyType initMaxKey = ::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
974  ::Kokkos::Details::ArithTraits<KeyType>::min () :
975  -::Kokkos::Details::ArithTraits<KeyType>::max ();
976  this->init (keys, startingValue, initMinKey, initMaxKey,
977  initMinKey, initMinKey, false);
978 
979 #ifdef HAVE_TPETRA_DEBUG
980  check ();
981 #endif // HAVE_TPETRA_DEBUG
982 }
983 
984 template<class KeyType, class ValueType, class DeviceType>
986 FixedHashTable (const Teuchos::ArrayView<const KeyType>& keys,
987  const Teuchos::ArrayView<const ValueType>& vals) :
988  minKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
989  maxKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
990  ::Kokkos::Details::ArithTraits<KeyType>::min () :
991  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
992  minVal_ (::Kokkos::Details::ArithTraits<ValueType>::max ()),
993  maxVal_ (::Kokkos::Details::ArithTraits<ValueType>::is_integer ?
994  ::Kokkos::Details::ArithTraits<ValueType>::min () :
995  -::Kokkos::Details::ArithTraits<ValueType>::max ()),
996  firstContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::max ()),
997  lastContigKey_ (::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
998  ::Kokkos::Details::ArithTraits<KeyType>::min () :
999  -::Kokkos::Details::ArithTraits<KeyType>::max ()),
1000  contiguousValues_ (false),
1001  checkedForDuplicateKeys_ (false),
1002  hasDuplicateKeys_ (false) // to revise in hasDuplicateKeys()
1003 {
1004  // mfh 01 May 2015: I don't trust that
1005  // Teuchos::ArrayView::getRawPtr() returns NULL when the size is 0,
1006  // so I ensure this manually.
1007  host_input_keys_type keys_k (keys.size () == 0 ? NULL : keys.getRawPtr (),
1008  keys.size ());
1009  host_input_vals_type vals_k (vals.size () == 0 ? NULL : vals.getRawPtr (),
1010  vals.size ());
1011  const KeyType initMinKey = ::Kokkos::Details::ArithTraits<KeyType>::max ();
1012  // min() for a floating-point type returns the minimum _positive_
1013  // normalized value. This is different than for integer types.
1014  // lowest() is new in C++11 and returns the least value, always
1015  // negative for signed finite types.
1016  //
1017  // mfh 23 May 2015: I have heard reports that
1018  // std::numeric_limits<int>::lowest() does not exist with the Intel
1019  // compiler. I'm not sure if the users in question actually enabled
1020  // C++11. However, it's easy enough to work around this issue. The
1021  // standard floating-point types are signed and have a sign bit, so
1022  // lowest() is just -max(). For integer types, we can use min()
1023  // instead.
1024  const KeyType initMaxKey = ::Kokkos::Details::ArithTraits<KeyType>::is_integer ?
1025  ::Kokkos::Details::ArithTraits<KeyType>::min () :
1026  -::Kokkos::Details::ArithTraits<KeyType>::max ();
1027  this->init (keys_k, vals_k, initMinKey, initMaxKey);
1028 
1029 #ifdef HAVE_TPETRA_DEBUG
1030  check ();
1031 #endif // HAVE_TPETRA_DEBUG
1032 }
1033 
1034 template<class KeyType, class ValueType, class DeviceType>
1035 void
1037 init (const keys_type& keys,
1038  ValueType startingValue,
1039  KeyType initMinKey,
1040  KeyType initMaxKey,
1041  KeyType firstContigKey,
1042  KeyType lastContigKey,
1043  const bool computeInitContigKeys)
1044 {
1045  using Kokkos::subview;
1046  using Kokkos::ViewAllocateWithoutInitializing;
1047  using Teuchos::TypeNameTraits;
1048  typedef typename std::decay<decltype (keys.extent (0)) >::type size_type;
1049  const char prefix[] = "Tpetra::Details::FixedHashTable: ";
1050 
1051  const offset_type numKeys = static_cast<offset_type> (keys.extent (0));
1052  {
1053  const offset_type theMaxVal = ::Kokkos::Details::ArithTraits<offset_type>::max ();
1054  const size_type maxValST = static_cast<size_type> (theMaxVal);
1055  TEUCHOS_TEST_FOR_EXCEPTION
1056  (keys.extent (0) > maxValST, std::invalid_argument, prefix << "The "
1057  "number of keys " << keys.extent (0) << " does not fit in "
1058  "offset_type = " << TypeNameTraits<offset_type>::name () << ", whose "
1059  "max value is " << theMaxVal << ". This means that it is not possible to "
1060  "use this constructor.");
1061  }
1062  TEUCHOS_TEST_FOR_EXCEPTION
1063  (static_cast<unsigned long long> (numKeys) >
1064  static_cast<unsigned long long> (::Kokkos::Details::ArithTraits<ValueType>::max ()),
1065  std::invalid_argument, "Tpetra::Details::FixedHashTable: The number of "
1066  "keys " << numKeys << " is greater than the maximum representable "
1067  "ValueType value " << ::Kokkos::Details::ArithTraits<ValueType>::max () << ". "
1068  "This means that it is not possible to use this constructor.");
1069  TEUCHOS_TEST_FOR_EXCEPTION
1070  (numKeys > static_cast<offset_type> (INT_MAX), std::logic_error, prefix <<
1071  "This class currently only works when the number of keys is <= INT_MAX = "
1072  << INT_MAX << ". If this is a problem for you, please talk to the Tpetra "
1073  "developers.");
1074 
1075  const bool buildInParallel =
1076  FHT::worthBuildingFixedHashTableInParallel<execution_space> ();
1077  const bool debug = ::Tpetra::Details::Behavior::debug ();
1078 
1079  // NOTE (mfh 14 May 2015) This method currently assumes UVM. We
1080  // could change that by setting up ptr and val as Kokkos::DualView
1081  // instances. If we do that, since we are filling on host for now,
1082  // we want to make sure that we only zero-fill ptr on host
1083  // initially, and that we don't fill val at all. Once we finish
1084  // Kokkos-izing all the set-up kernels, we won't need DualView for
1085  // either ptr or val.
1086 
1087  if (computeInitContigKeys) {
1088  // Find the first and last initial contiguous keys. If we find a
1089  // long sequence of initial contiguous keys, we can save space by
1090  // not storing them explicitly as pairs in the hash table.
1091  //
1092  // NOTE (mfh 01 Jun 2015) Doing this in parallel requires a scan
1093  // ("min index such that the difference between the current key and
1094  // the next != 1"), which takes multiple passes over the data. We
1095  // could fuse it with CountBuckets (only update counts on 'final'
1096  // pass). However, we're really just moving this sequential search
1097  // out of Map's constructor here, so there is no loss in doing it
1098  // sequentially for now. Later, we can work on parallelization.
1099  //
1100  // NOTE (mfh 01 Jun 2015, 28 Mar 2016) The code below assumes UVM.
1101  // It is rational to assume UVM here, because this is "sparse"
1102  // access -- we might not need to look at all the entries of keys,
1103  // so it doesn't necessarily make sense to copy the whole thing
1104  // back to host.
1105  if (numKeys > 0) {
1106  firstContigKey_ = keys[0];
1107  // Start with one plus, then decrement at the end. That lets us do
1108  // only one addition per loop iteration, rather than two (if we test
1109  // against lastContigKey + 1 and then increment lastContigKey).
1110  lastContigKey_ = firstContigKey_ + 1;
1111 
1112  // We will only store keys in the table that are not part of the
1113  // initial contiguous sequence. It's possible for the initial
1114  // contiguous sequence to be trivial, which for a nonzero number of
1115  // keys means that the "sequence" has length 1.
1116  for (offset_type k = 1; k < numKeys; ++k) {
1117  if (lastContigKey_ != keys[k]) {
1118  break;
1119  }
1120  ++lastContigKey_;
1121  }
1122  --lastContigKey_;
1123  }
1124  }
1125  else {
1126  firstContigKey_ = firstContigKey;
1127  lastContigKey_ = lastContigKey;
1128  }
1129 
1130  offset_type startIndex;
1131  if (numKeys > 0) {
1132  initMinKey = std::min (initMinKey, firstContigKey_);
1133  initMaxKey = std::max (initMaxKey, lastContigKey_);
1134  startIndex = static_cast<offset_type> (lastContigKey_ - firstContigKey_);
1135  } else {
1136  startIndex = 0;
1137  }
1138 
1139  const offset_type theNumKeys = numKeys - startIndex;
1140  const offset_type size = hash_type::getRecommendedSize (theNumKeys);
1141 #ifdef HAVE_TPETRA_DEBUG
1142  TEUCHOS_TEST_FOR_EXCEPTION(
1143  size == 0 && numKeys != 0, std::logic_error,
1144  "Tpetra::Details::FixedHashTable constructor: "
1145  "getRecommendedSize(" << numKeys << ") returned zero, "
1146  "even though the number of keys " << numKeys << " is nonzero. "
1147  "Please report this bug to the Tpetra developers.");
1148 #endif // HAVE_TPETRA_DEBUG
1149  keys_type theKeys =
1150  subview (keys, std::pair<offset_type, offset_type> (startIndex, numKeys));
1151 
1152  // FIXME (mfh 28 Mar 2016) For some reason, we don't seem to need a
1153  // fence here, but we do before other allocations.
1154 
1155  // The array of counts must be separate from the array of offsets,
1156  // in order for parallel_scan to work correctly.
1157  typedef typename ptr_type::non_const_type counts_type;
1158  counts_type counts ("FixedHashTable::counts", size);
1159 
1160  //
1161  // Count the number of "buckets" per offsets array (ptr) entry.
1162  //
1163 
1164  // The Kokkos kernel uses atomic update instructions to count the
1165  // number of "buckets" per offsets array (ptr) entry. Atomic
1166  // updates incur overhead, even in the sequential case. The Kokkos
1167  // kernel is still correct in that case, but I would rather not
1168  // incur overhead then.
1169  if (buildInParallel) {
1170  FHT::CountBuckets<counts_type, keys_type> functor (counts, theKeys, size);
1171  using execution_space = typename counts_type::execution_space;
1172  using range_type = Kokkos::RangePolicy<execution_space, offset_type>;
1173  const char kernelLabel[] = "Tpetra::Details::FixedHashTable CountBuckets";
1174  if (debug) {
1175  using key_type = typename keys_type::non_const_value_type;
1176  Kokkos::pair<int, key_type> err;
1177  Kokkos::parallel_reduce (kernelLabel, range_type (0, theNumKeys),
1178  functor, err);
1179  TEUCHOS_TEST_FOR_EXCEPTION
1180  (err.first != 0, std::logic_error, "Tpetra::Details::FixedHashTable "
1181  "constructor: CountBuckets found a key " << err.second << " that "
1182  "results in an out-of-bounds hash value.");
1183  }
1184  else {
1185  Kokkos::parallel_for (kernelLabel, range_type (0, theNumKeys), functor);
1186  }
1187  }
1188  else {
1189  // Access to counts is not necessarily contiguous, but is
1190  // irregular and likely touches all pages of the array. Thus, it
1191  // probably makes sense to use a host copy explicitly, rather than
1192  // assume UVM.
1193  auto countsHost = Kokkos::create_mirror_view (counts);
1194  // FIXME (mfh 28 Mar 2016) Does create_mirror_view zero-fill?
1195  Kokkos::deep_copy (countsHost, static_cast<offset_type> (0));
1196 
1197  for (offset_type k = 0; k < theNumKeys; ++k) {
1198  using key_type = typename keys_type::non_const_value_type;
1199  const key_type key = theKeys[k];
1200 
1201  using hash_value_type = typename hash_type::result_type;
1202  const hash_value_type hashVal = hash_type::hashFunc (key, size);
1203  TEUCHOS_TEST_FOR_EXCEPTION
1204  (hashVal < hash_value_type (0) ||
1205  hashVal >= hash_value_type (countsHost.extent (0)),
1206  std::logic_error, "Tpetra::Details::FixedHashTable "
1207  "constructor: Sequential CountBuckets found a key " << key
1208  << " that results in an out-of-bounds hash value.");
1209 
1210  ++countsHost[hashVal];
1211  }
1212  Kokkos::deep_copy (counts, countsHost);
1213  }
1214 
1215  // FIXME (mfh 28 Mar 2016) Need a fence here, otherwise SIGSEGV w/
1216  // CUDA when ptr is filled.
1217  execution_space().fence ();
1218 
1219  // Kokkos::View fills with zeros by default.
1220  typename ptr_type::non_const_type ptr ("FixedHashTable::ptr", size+1);
1221 
1222  // Compute row offsets via prefix sum:
1223  //
1224  // ptr[i+1] = \sum_{j=0}^{i} counts[j].
1225  //
1226  // Thus, ptr[i+1] - ptr[i] = counts[i], so that ptr[i+1] = ptr[i] +
1227  // counts[i]. If we stored counts[i] in ptr[i+1] on input, then the
1228  // formula is ptr[i+1] += ptr[i].
1229  //
1230  // parallel_scan does not incur overhead with Kokkos::Serial, but
1231  // with actual parallel execution spaces, it does require multiple
1232  // passes over the data. Thus, it still makes sense to have a
1233  // sequential fall-back.
1234 
1236  if (buildInParallel) {
1237  computeOffsetsFromCounts (ptr, counts);
1238  }
1239 
1240  if (! buildInParallel || debug) {
1241  Kokkos::HostSpace hostMemSpace;
1242  auto counts_h = Kokkos::create_mirror_view (hostMemSpace, counts);
1243  Kokkos::deep_copy (counts_h, counts);
1244  auto ptr_h = Kokkos::create_mirror_view (hostMemSpace, ptr);
1245 
1246 #ifdef KOKKOS_ENABLE_SERIAL
1247  Kokkos::Serial hostExecSpace;
1248 #else
1249  Kokkos::DefaultHostExecutionSpace hostExecSpace;
1250 #endif // KOKKOS_ENABLE_SERIAL
1251 
1252  computeOffsetsFromCounts (hostExecSpace, ptr_h, counts_h);
1253  Kokkos::deep_copy (ptr, ptr_h);
1254 
1255  if (debug) {
1256  bool bad = false;
1257  for (offset_type i = 0; i < size; ++i) {
1258  if (ptr_h[i+1] != ptr_h[i] + counts_h[i]) {
1259  bad = true;
1260  }
1261  }
1262  TEUCHOS_TEST_FOR_EXCEPTION
1263  (bad, std::logic_error, "Tpetra::Details::FixedHashTable "
1264  "constructor: computeOffsetsFromCounts gave an incorrect "
1265  "result.");
1266  }
1267  }
1268 
1269  // FIXME (mfh 28 Mar 2016) Need a fence here, otherwise SIGSEGV w/
1270  // CUDA when val is filled.
1271  execution_space().fence ();
1272 
1273  // Allocate the array of (key,value) pairs. Don't fill it with
1274  // zeros, because we will fill it with actual data below.
1275  using Kokkos::ViewAllocateWithoutInitializing;
1276  typedef typename val_type::non_const_type nonconst_val_type;
1277  nonconst_val_type val (ViewAllocateWithoutInitializing ("FixedHashTable::pairs"),
1278  theNumKeys);
1279 
1280  // Fill in the hash table's "values" (the (key,value) pairs).
1281  typedef FHT::FillPairs<typename val_type::non_const_type, keys_type,
1282  typename ptr_type::non_const_type> functor_type;
1283  typename functor_type::value_type result (initMinKey, initMaxKey);
1284 
1285  const ValueType newStartingValue = startingValue + static_cast<ValueType> (startIndex);
1286  if (buildInParallel) {
1287  functor_type functor (val, counts, ptr, theKeys, newStartingValue,
1288  initMinKey, initMaxKey);
1289  typedef Kokkos::RangePolicy<execution_space, offset_type> range_type;
1290  Kokkos::parallel_reduce (range_type (0, theNumKeys), functor, result);
1291  }
1292  else {
1293  for (offset_type k = 0; k < theNumKeys; ++k) {
1294  typedef typename hash_type::result_type hash_value_type;
1295  const KeyType key = theKeys[k];
1296  if (key > result.maxKey_) {
1297  result.maxKey_ = key;
1298  }
1299  if (key < result.minKey_) {
1300  result.minKey_ = key;
1301  }
1302  const ValueType theVal = newStartingValue + static_cast<ValueType> (k);
1303  const hash_value_type hashVal = hash_type::hashFunc (key, size);
1304 
1305  // Return the old count; decrement afterwards.
1306  //
1307  // NOTE (mfh 28 Mar 2016) This assumes UVM. It might make more
1308  // sense to use a host mirror here, due to the access pattern.
1309  const offset_type count = counts[hashVal];
1310  --counts[hashVal];
1311  if (count == 0) {
1312  result.success_ = false; // FAILURE!
1313  break;
1314  }
1315  else {
1316  // NOTE (mfh 28 Mar 2016) This assumes UVM.
1317  const offset_type curPos = ptr[hashVal+1] - count;
1318 
1319  // NOTE (mfh 28 Mar 2016) This assumes UVM.
1320  val[curPos].first = key;
1321  val[curPos].second = theVal;
1322  }
1323  }
1324  }
1325 
1326  // FIXME (mfh 01 Jun 2015) Temporarily commented out because of
1327  // reports of exceptions being thrown in Albany.
1328  //
1329  // TEUCHOS_TEST_FOR_EXCEPTION
1330  // (! result.success_, std::logic_error, "Tpetra::Details::FixedHashTable::"
1331  // "init: Filling the hash table failed! Please report this bug to the "
1332  // "Tpetra developers.");
1333  (void) result; // prevent build warning (set but unused variable)
1334 
1335  // "Commit" the computed arrays and other computed quantities.
1336  ptr_ = ptr;
1337  val_ = val;
1338  minKey_ = result.minKey_;
1339  maxKey_ = result.maxKey_;
1340  // We've already set firstContigKey_ and lastContigKey_ above.
1341 }
1342 
1343 
1344 template<class KeyType, class ValueType, class DeviceType>
1345 void
1346 FixedHashTable<KeyType, ValueType, DeviceType>::
1347 init (const host_input_keys_type& keys,
1348  const host_input_vals_type& vals,
1349  KeyType initMinKey,
1350  KeyType initMaxKey)
1351 {
1352  const offset_type numKeys = static_cast<offset_type> (keys.extent (0));
1353  TEUCHOS_TEST_FOR_EXCEPTION
1354  (static_cast<unsigned long long> (numKeys) > static_cast<unsigned long long> (::Kokkos::Details::ArithTraits<ValueType>::max ()),
1355  std::invalid_argument, "Tpetra::Details::FixedHashTable: The number of "
1356  "keys " << numKeys << " is greater than the maximum representable "
1357  "ValueType value " << ::Kokkos::Details::ArithTraits<ValueType>::max () << ".");
1358  TEUCHOS_TEST_FOR_EXCEPTION
1359  (numKeys > static_cast<offset_type> (INT_MAX), std::logic_error, "Tpetra::"
1360  "Details::FixedHashTable: This class currently only works when the number "
1361  "of keys is <= INT_MAX = " << INT_MAX << ". If this is a problem for you"
1362  ", please talk to the Tpetra developers.");
1363 
1364  // There's no need to find the first and last initial contiguous
1365  // keys in this case, because if we reach this init() function, then
1366  // hasContiguousValues() is false and so get() doesn't use the
1367  // initial contiguous sequence of keys.
1368 
1369  const offset_type size = hash_type::getRecommendedSize (numKeys);
1370 #ifdef HAVE_TPETRA_DEBUG
1371  TEUCHOS_TEST_FOR_EXCEPTION(
1372  size == 0 && numKeys != 0, std::logic_error,
1373  "Tpetra::Details::FixedHashTable constructor: "
1374  "getRecommendedSize(" << numKeys << ") returned zero, "
1375  "even though the number of keys " << numKeys << " is nonzero. "
1376  "Please report this bug to the Tpetra developers.");
1377 #endif // HAVE_TPETRA_DEBUG
1378 
1379  // NOTE (mfh 14 May 2015) This method currently assumes UVM. We
1380  // could change that by setting up ptr and val as Kokkos::DualView
1381  // instances. If we do that, since we are filling on host for now,
1382  // we want to make sure that we only zero-fill ptr on host
1383  // initially, and that we don't fill val at all. Once we finish
1384  // Kokkos-izing all the set-up kernels, we won't need DualView for
1385  // either ptr or val.
1386 
1387  typename ptr_type::non_const_type ptr ("FixedHashTable::ptr", size + 1);
1388 
1389  // Allocate the array of key,value pairs. Don't waste time filling
1390  // it with zeros, because we will fill it with actual data below.
1391  using Kokkos::ViewAllocateWithoutInitializing;
1392  typedef typename val_type::non_const_type nonconst_val_type;
1393  nonconst_val_type val (ViewAllocateWithoutInitializing ("FixedHashTable::pairs"),
1394  numKeys);
1395 
1396  // Compute number of entries in each hash table position.
1397  for (offset_type k = 0; k < numKeys; ++k) {
1398  const typename hash_type::result_type hashVal =
1399  hash_type::hashFunc (keys[k], size);
1400  // Shift over one, so that counts[j] = ptr[j+1]. See below.
1401  ++ptr[hashVal+1];
1402  }
1403 
1404  // Compute row offsets via prefix sum:
1405  //
1406  // ptr[i+1] = \sum_{j=0}^{i} counts[j].
1407  //
1408  // Thus, ptr[i+1] - ptr[i] = counts[i], so that ptr[i+1] = ptr[i] +
1409  // counts[i]. If we stored counts[i] in ptr[i+1] on input, then the
1410  // formula is ptr[i+1] += ptr[i].
1411  for (offset_type i = 0; i < size; ++i) {
1412  ptr[i+1] += ptr[i];
1413  }
1414  //ptr[0] = 0; // We've already done this when initializing ptr above.
1415 
1416  // curRowStart[i] is the offset of the next element in row i.
1417  typename ptr_type::non_const_type curRowStart ("FixedHashTable::curRowStart", size);
1418 
1419  // Fill in the hash table.
1420  FHT::FillPairsResult<KeyType> result (initMinKey, initMaxKey);
1421  for (offset_type k = 0; k < numKeys; ++k) {
1422  typedef typename hash_type::result_type hash_value_type;
1423  const KeyType key = keys[k];
1424  if (key > result.maxKey_) {
1425  result.maxKey_ = key;
1426  }
1427  if (key < result.minKey_) {
1428  result.minKey_ = key;
1429  }
1430  const ValueType theVal = vals[k];
1431  if (theVal > maxVal_) {
1432  maxVal_ = theVal;
1433  }
1434  if (theVal < minVal_) {
1435  minVal_ = theVal;
1436  }
1437  const hash_value_type hashVal = hash_type::hashFunc (key, size);
1438 
1439  const offset_type offset = curRowStart[hashVal];
1440  const offset_type curPos = ptr[hashVal] + offset;
1441  if (curPos >= ptr[hashVal+1]) {
1442  result.success_ = false; // FAILURE!
1443  }
1444  else {
1445  val[curPos].first = key;
1446  val[curPos].second = theVal;
1447  ++curRowStart[hashVal];
1448  }
1449  }
1450 
1451  TEUCHOS_TEST_FOR_EXCEPTION
1452  (! result.success_, std::logic_error, "Tpetra::Details::FixedHashTable::"
1453  "init: Filling the hash table failed! Please report this bug to the "
1454  "Tpetra developers.");
1455 
1456  // "Commit" the computed arrays and other computed quantities.
1457  ptr_ = ptr;
1458  val_ = val;
1459  minKey_ = result.minKey_;
1460  maxKey_ = result.maxKey_;
1461  // We've already assigned to minVal_ and maxVal_ above.
1462 }
1463 
1464 template <class KeyType, class ValueType, class DeviceType>
1465 bool
1468 {
1469  if (! checkedForDuplicateKeys_) {
1470  hasDuplicateKeys_ = checkForDuplicateKeys ();
1471  checkedForDuplicateKeys_ = true;
1472  }
1473  return hasDuplicateKeys_;
1474 }
1475 
1476 template <class KeyType, class ValueType, class DeviceType>
1477 bool
1479 checkForDuplicateKeys () const
1480 {
1481  const offset_type size = this->getSize ();
1482  // It's allowed for the hash table to have a positive number of
1483  // buckets (getSize()), but still store no entries (numPairs()).
1484  // Both cases trivially entail no duplicates.
1485  if (size == 0 || this->numPairs () == 0) {
1486  return false;
1487  }
1488  else {
1489  typedef FHT::CheckForDuplicateKeys<ptr_type, val_type> functor_type;
1490  functor_type functor (val_, ptr_);
1491  int hasDupKeys = 0;
1492  typedef Kokkos::RangePolicy<execution_space, offset_type> range_type;
1493  Kokkos::parallel_reduce (range_type (0, size), functor, hasDupKeys);
1494  return hasDupKeys > 0;
1495  }
1496 }
1497 
1498 template <class KeyType, class ValueType, class DeviceType>
1499 std::string
1501 description () const
1502 {
1503  std::ostringstream oss;
1504  oss << "FixedHashTable<"
1505  << Teuchos::TypeNameTraits<KeyType>::name () << ","
1506  << Teuchos::TypeNameTraits<ValueType>::name () << ">: "
1507  << "{ numKeys: " << val_.extent (0)
1508  << ", tableSize: " << this->getSize () << " }";
1509  return oss.str();
1510 }
1511 
1512 template <class KeyType, class ValueType, class DeviceType>
1513 void
1515 describe (Teuchos::FancyOStream& out,
1516  const Teuchos::EVerbosityLevel verbLevel) const
1517 {
1518  using std::endl;
1519  using std::setw;
1520  using Teuchos::OSTab;
1521  using Teuchos::rcpFromRef;
1522  using Teuchos::TypeNameTraits;
1523  using Teuchos::VERB_DEFAULT;
1524  using Teuchos::VERB_NONE;
1525  using Teuchos::VERB_LOW;
1526  using Teuchos::VERB_EXTREME;
1527 
1528  // NOTE (mfh 14 May 2015) This method currently assumes UVM for
1529  // access to ptr_ and val_ from the host.
1530 
1531  Teuchos::EVerbosityLevel vl = verbLevel;
1532  if (vl == VERB_DEFAULT) vl = VERB_LOW;
1533 
1534  if (vl == VERB_NONE) {
1535  // do nothing
1536  }
1537  else if (vl == VERB_LOW) {
1538  out << this->description() << endl;
1539  }
1540  else { // MEDIUM, HIGH or EXTREME
1541  out << "FixedHashTable:" << endl;
1542  {
1543  OSTab tab1 (rcpFromRef (out));
1544 
1545  // const std::string label = this->getObjectLabel ();
1546  // if (label != "") {
1547  // out << "label: " << label << endl;
1548  // }
1549  out << "Template parameters:" << endl;
1550  {
1551  OSTab tab2 (rcpFromRef (out));
1552  out << "KeyType: " << TypeNameTraits<KeyType>::name () << endl
1553  << "ValueType: " << TypeNameTraits<ValueType>::name () << endl;
1554  }
1555 
1556  const offset_type tableSize = this->getSize ();
1557  const offset_type numKeys = val_.extent (0);
1558 
1559  out << "Table parameters:" << endl;
1560  {
1561  OSTab tab2 (rcpFromRef (out));
1562  out << "numKeys: " << numKeys << endl
1563  << "tableSize: " << tableSize << endl;
1564  }
1565 
1566  if (vl >= VERB_EXTREME) {
1567  out << "Contents: ";
1568  if (tableSize == 0 || numKeys == 0) {
1569  out << "[]" << endl;
1570  } else {
1571  out << "[ " << endl;
1572  {
1573  OSTab tab2 (rcpFromRef (out));
1574  for (offset_type i = 0; i < tableSize; ++i) {
1575  OSTab tab3 (rcpFromRef (out));
1576  out << "[";
1577  for (offset_type k = ptr_[i]; k < ptr_[i+1]; ++k) {
1578  out << "(" << val_[k].first << "," << val_[k].second << ")";
1579  if (k + 1 < ptr_[i+1]) {
1580  out << ", ";
1581  }
1582  }
1583  out << "]" << endl;
1584  } // for each table position i
1585  }
1586  out << "]" << endl;
1587  } // The table contains entries
1588  } // vl >= VERB_EXTREME
1589  }
1590  out << endl;
1591  } // if vl > VERB_LOW
1592 }
1593 
1594 } // namespace Details
1595 } // namespace Tpetra
1596 
1597 // Macro that explicitly instantiates FixedHashTable for the given local
1598 // ordinal (LO) and global ordinal (GO) types. Note that FixedHashTable's
1599 // template parameters occur in the opposite order of most Tpetra
1600 // classes. This is because FixedHashTable performs global-to-local
1601 // lookup, and the convention in templated C++ lookup tables (such as
1602 // std::map) is <KeyType, ValueType>.
1603 //
1604 // This macro must be explanded within the Tpetra::Details namespace.
1605 #define TPETRA_DETAILS_FIXEDHASHTABLE_INSTANT_DEFAULTNODE(LO,GO) \
1606  template class FixedHashTable< GO , LO >;
1607 
1608 // Macro that explicitly instantiates FixedHashTable for the given
1609 // local ordinal (LO), global ordinal (GO), and Kokkos device (DEVICE)
1610 // types. Note that FixedHashTable's first two template parameters
1611 // occur in the opposite order of most Tpetra classes. This is
1612 // because FixedHashTable performs global-to-local lookup, and the
1613 // convention in templated C++ lookup tables (such as std::map) is
1614 // <KeyType, ValueType>.
1615 //
1616 // This macro must be explanded within the Tpetra::Details namespace.
1617 #define TPETRA_DETAILS_FIXEDHASHTABLE_INSTANT(LO, GO, DEVICE) \
1618  template class FixedHashTable< GO , LO , DEVICE >;
1619 
1620 #endif // TPETRA_DETAILS_FIXEDHASHTABLE_DEF_HPP
CheckForDuplicateKeys(const pairs_view_type &pairs, const offsets_view_type &ptr)
Constructor.
Parallel reduce functor for filling the FixedHashTable, and computing the min and max keys...
FillPairs(const pairs_view_type &pairs, const counts_view_type &counts, const offsets_view_type &ptr, const keys_view_type &keys, const typename pair_type::second_type startingValue, const key_type initMinKey, const key_type initMaxKey)
Constructor that takes initial min and max key values.
KOKKOS_INLINE_FUNCTION void init(value_type &dst) const
Set the initial value of the reduction result.
KOKKOS_INLINE_FUNCTION void join(volatile value_type &dst, const volatile value_type &src) const
Combine two intermediate reduction results.
KOKKOS_INLINE_FUNCTION void operator()(const size_type &i) const
Do this for every entry of keys_.
void describe(Teuchos::FancyOStream &out, const Teuchos::EVerbosityLevel verbLevel=Teuchos::Describable::verbLevel_default) const
Print this object with the given verbosity to the output stream.
ResultType result_type
Type of the return value of the hash function.
Parallel for functor for counting &quot;buckets&quot; in the FixedHashTable.
static bool debug()
Whether Tpetra is in debug mode.
bool hasKeys() const
Whether it is safe to call getKey().
bool success_
Whether fill succeeded (it can only fail on a bug)
void deep_copy(MultiVector< DS, DL, DG, DN > &dst, const MultiVector< SS, SL, SG, SN > &src)
Copy the contents of the MultiVector src into dst.
The hash function for FixedHashTable.
Declare and define the functions Tpetra::Details::computeOffsetsFromCounts and Tpetra::computeOffsets...
std::string description() const
Implementation of Teuchos::Describable interface.
KOKKOS_INLINE_FUNCTION void operator()(const size_type &i, value_type &dst) const
Parallel loop body; do this for every entry of keys_.
FixedHashTable()
Default constructor; makes an empty table.
FillPairs(const pairs_view_type &pairs, const counts_view_type &counts, const offsets_view_type &ptr, const keys_view_type &keys, const typename pair_type::second_type startingValue)
Constructor.
KOKKOS_INLINE_FUNCTION void init(value_type &dst) const
Set the initial value of the reduction result.
bool hasDuplicateKeys()
Whether the table has any duplicate keys.
static KOKKOS_INLINE_FUNCTION result_type hashFunc(const argument_type &key, const offset_type &size)
The hash function.
CountBuckets(const counts_view_type &counts, const keys_view_type &keys, const size_type size)
Constructor.
KOKKOS_INLINE_FUNCTION void init(value_type &dst) const
Set the initial value of the reduction result.
Functor for checking whether a FixedHashTable has one or more duplicate entries.
OffsetsViewType::non_const_value_type computeOffsetsFromCounts(const ExecutionSpace &execSpace, const OffsetsViewType &ptr, const CountsViewType &counts)
Compute offsets from counts.
Reduction result for FillPairs functor below.
KOKKOS_INLINE_FUNCTION void operator()(const size_type &i, value_type &dst) const
Parallel loop body.
Kokkos::View< const KeyType *, Kokkos::LayoutLeft, device_type > keys_type
Type of a 1-D Kokkos::View (array) used to store keys.
Declaration of Tpetra::Details::Behavior, a class that describes Tpetra&#39;s behavior.