Usage 1: Use as a class member


public enum Currency {PENNY, NICKLE, DIME, QUARTER};

Currency coin = Currency.PENNY;

coin = 1; //compilation error


Usage 2: Use as a class member (with assigned values)


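public enum Currency {PENNY(1), NICKLE(5), DIME(10), QUARTER(25);
    private final int value; Currency(int value) { this.value = value; }} // a matching constructor is required for PENNY(1) etc. to compile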
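
As a minimal sketch of how such assigned values can be read back, the example below declares a field, a constructor and an accessor. The names Coin, cents and getCents() are hypothetical and chosen only for illustration; they do not come from the original post.

```java
public class CurrencyDemo {

    // Hypothetical enum, named Coin so it does not clash with Currency above.
    public enum Coin {
        PENNY(1), NICKLE(5), DIME(10), QUARTER(25);

        private final int cents;   // value attached to each constant

        Coin(int cents) {          // enum constructors are implicitly private
            this.cents = cents;
        }

        public int getCents() {
            return cents;
        }
    }

    public static void main(String[] args) {
        // values() returns the constants in declaration order
        for (Coin c : Coin.values()) {
            System.out.println(c + " = " + c.getCents());
        }
    }
}
```

Keeping the field final means each constant stays immutable after construction, which is usually what callers expect from an enum.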


Usage 3: Use as the argument of a switch statement


Currency usCoin = Currency.DIME;
    switch (usCoin) {
            case PENNY:
                    System.out.println("Penny coin");
                    break;
            case NICKLE:
                    System.out.println("Nickle coin");
                    break;
            case DIME:
                    System.out.println("Dime coin");
                    break;
            case QUARTER:
                    System.out.println("Quarter coin");
    }


Usage 4: Each enum constant is a single static final instance, so constants can be compared with ==


Currency usCoin = Currency.DIME;
    if(usCoin == Currency.DIME){
       System.out.println("enum in java can be " +
               "compared using ==");
    }


Usage 6: Adding methods to an enum


public enum SizeEnum {

  SMALL("S"), MEDIUM("M"), LARGE("L");

  // Fields
  private String mAbbreviation;

  // Constructor
  private SizeEnum(String abbreviation) {
    mAbbreviation = abbreviation;
  }

  // Methods
  public String getAbbreviation() { return mAbbreviation; }

  public void setAbbreviation(String abbreviation) { mAbbreviation = abbreviation; }

}


Usage 7: Use as an enum class


public enum FontStyle {
    NORMAL, BOLD, ITALIC, UNDERLINE;
 
    FontStyle() {
    }
}

Use as a nested type (accessed as SampleClass.FontStyle.NORMAL)

public class SampleClass {
    public enum FontStyle { NORMAL, BOLD, ITALIC, UNDERLINE }
    ...
}


Usage 8: Using EnumSet

import java.util.EnumSet;

import java.util.Set;

/**
 * Simple Java Program to demonstrate how to use EnumSet.
 * It has some interesting use cases and it's specialized collection for
 * Enumeration types. Using Enum with EnumSet will give you far better
 * performance than using Enum with HashSet, or LinkedHashSet.
 *
 * @author Javin Paul
 */
public class EnumSetDemo {

    private enum Color {
        RED(255, 0, 0), GREEN(0, 255, 0), BLUE(0, 0, 255);

        private int r;
        private int g;
        private int b;

        private Color(int r, int g, int b) {
            this.r = r;
            this.g = g;
            this.b = b;
        }

        public int getR() { return r; }

        public int getG() { return g; }

        public int getB() { return b; }
    }

    public static void main(String args[]) {
        // this will draw line in yellow color
        EnumSet<Color> yellow = EnumSet.of(Color.RED, Color.GREEN);
        drawLine(yellow);

        // RED + GREEN + BLUE = WHITE
        EnumSet<Color> white = EnumSet.of(Color.RED, Color.GREEN, Color.BLUE);
        drawLine(white);

        // RED + BLUE = PINK
        EnumSet<Color> pink = EnumSet.of(Color.RED, Color.BLUE);
        drawLine(pink);
    }

    public static void drawLine(Set<Color> colors) {
        System.out.println("Requested Colors to draw lines : " + colors);
        for (Color c : colors) {
            System.out.println("drawing line in color : " + c);
        }
    }
}

Output:
Requested Colors to draw lines : [RED, GREEN]
drawing line in color : RED
drawing line in color : GREEN
Requested Colors to draw lines : [RED, GREEN, BLUE]
drawing line in color : RED
drawing line in color : GREEN
drawing line in color : BLUE
Requested Colors to draw lines : [RED, BLUE]
drawing line in color : RED
drawing line in color : BLUE



Usage 9: Replacing bit fields with EnumSet

package resolver;

public class IntEnumPatternExample {

    public static final int STYLE_BOLD          = 1 << 0; // 1
    public static final int STYLE_ITALIC        = 1 << 1; // 2
    public static final int STYLE_UNDERLINE     = 1 << 2; // 4
    public static final int STYLE_STRIKETHROUGH = 1 << 3; // 8

    public static void main(String[] args) {
        final IntEnumPatternResolver resolver = new IntEnumPatternResolver();
        resolver.enableAll(STYLE_BOLD, STYLE_ITALIC, STYLE_STRIKETHROUGH, STYLE_UNDERLINE);
        resolver.disable(STYLE_STRIKETHROUGH);
        resolver.toggle(STYLE_UNDERLINE);
        print(resolver);
    }

    private static void print(IntEnumPatternResolver resolver) {
        assert resolver.isEnabled(STYLE_BOLD) == true;
        assert resolver.isEnabled(STYLE_ITALIC) == true;
        assert resolver.isEnabled(STYLE_UNDERLINE) == false;
        assert resolver.isEnabled(STYLE_STRIKETHROUGH) == false;

        System.out.println("STYLE_BOLD: " + resolver.isEnabled(STYLE_BOLD));
        System.out.println("STYLE_ITALIC: " + resolver.isEnabled(STYLE_ITALIC));
        System.out.println("STYLE_UNDERLINE: " + resolver.isEnabled(STYLE_UNDERLINE));
        System.out.println("STYLE_STRIKETHROUGH: " + resolver.isEnabled(STYLE_STRIKETHROUGH));
    }

}
package resolver;

import java.util.EnumSet;

public class EnumPatternExample {

    public enum Style {
        BOLD, ITALIC, UNDERLINE, STRIKETHROUGH
    }

    public static void main(String[] args) {
        final EnumSet<Style> styles = EnumSet.noneOf(Style.class);
        styles.addAll(EnumSet.range(Style.BOLD, Style.STRIKETHROUGH)); // enable all constants
        styles.removeAll(EnumSet.of(Style.UNDERLINE, Style.STRIKETHROUGH)); // disable a couple
        assert EnumSet.of(Style.BOLD, Style.ITALIC).equals(styles); // check set contents are correct
        System.out.println(styles);
    }

}

http://claude-martin.ch/enumbitset/  (a complete EnumBitSet library)


Usage 10: Using EnumMap


enum Importance {
Low, Medium, High, Critical
}


EnumMap<Importance, String> enumMap = new EnumMap<>(Importance.class);


enumMap.put(Importance.Low, "=Low");
enumMap.put(Importance.High, "=High");


String value1 = enumMap.get(Importance.Low);
String value2 = enumMap.get(Importance.High);


package program;
import java.util.EnumMap;
public class Program {
    // 1. Create an Enum.
    enum Importance {
Low, Medium, High, Critical
    }
    public static void main(String[] args) {
        // 2. Create an EnumMap.
        EnumMap<Importance, String> enumMap = new EnumMap<>(Importance.class);
        // 3. PUT values into the map.
        enumMap.put(Importance.Low, "=Low");
        enumMap.put(Importance.High, "=High");
        // 4. Get values from the map.
        String value1 = enumMap.get(Importance.Low);
        String value2 = enumMap.get(Importance.High);
        System.out.println(value1);
        System.out.println(value2);
    }
}



Java volatile



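- When a volatile variable is read, it is read from the computer's main memory rather than from a CPU cache.
Likewise, when a volatile variable is written, the write goes to main memory and not just to a CPU cache.

- For non-volatile variables, there is no guarantee about when the Java Virtual Machine (JVM) reads data from main memory into a CPU cache, or writes data from a CPU cache back to main memory.

- Declaring the variable volatile makes threads see the latest value in memory, which removes this visibility problem.

- It does not solve everything, however.

- When several threads access one volatile variable, a thread that reads before another thread's write has reached memory still gets the old value.
  And if two threads read the same variable at the same time and each writes back a result computed from it, the last write becomes the final value and the other update is lost. Synchronization between the two threads on that variable is broken.

- Wherever a race condition is possible, the access needs to be protected with synchronized.

Alternatively, use AtomicLong or AtomicReference from the java.util.concurrent.atomic package.

- (Translator's note: in code of the form while (stateVariable) {  ....  notify()  }, the state variable should always be declared volatile.)

- volatile alone is sufficient when a single thread both writes and reads the variable and all other threads only read it. No lock is needed in that case.

- volatile works for both 32-bit and 64-bit variables: reads and writes of a volatile long or double are atomic. It is a non-volatile long or double whose write may be split into two 32-bit halves.

- volatile costs a little performance, so use it only where it is needed.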
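
The lost-update problem described above can be sketched as follows. This is an illustrative example only: the class and method names are hypothetical, and the thread and iteration counts are arbitrary. It increments a volatile long and an AtomicLong from several threads and returns both totals.

```java
import java.util.concurrent.atomic.AtomicLong;

public class VolatileCounterDemo {

    // volatile gives visibility, but ++ is still a read-modify-write sequence.
    private static volatile long volatileCount = 0;

    // AtomicLong performs the whole increment as one atomic operation.
    private static final AtomicLong atomicCount = new AtomicLong();

    // Returns { volatile total, atomic total } after all threads have finished.
    public static long[] runCounters(int threads, int incrementsPerThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCount++;                // racy: an update can be lost
                    atomicCount.incrementAndGet();  // never loses an update
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        long[] totals = runCounters(4, 100_000);
        System.out.println("volatile long: " + totals[0] + " (can be lower than expected)");
        System.out.println("AtomicLong:    " + totals[1]);
    }
}
```

The volatile total can never exceed the expected count, but on a multi-core machine it often falls short, while the AtomicLong total is always exact.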

Since Java 5, the volatile keyword guarantees more than just reading and writing main memory.
  See the section below (the topic is easy to misread, so reading the original article directly is recommended).


The Java volatile Happens-Before Guarantee



C/C++  volatile


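The volatile keyword is commonly used in three situations, including the hardware control case discussed earlier.

(1) MMIO (Memory-mapped I/O)

(2) Interrupt service routines (ISRs)

(3) Multithreaded environments


What all three have in common is that something outside the program's current flow of execution can change the value of a variable. In interrupt service routines and multithreaded programs, local variables allocated on the stack are generally not shared, so volatile only needs to be considered for shared global variables.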


int done = FALSE;

void main()
{
     ...
     while (!done)
     {
         // Wait
     }
     ...
}

interrupt void serial_isr(void)
{
     ...
     if (ETX == rxChar)
     {
         done = TRUE;
     }
     ...
}

serial.c


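The serial communication example above declares done as a global variable and ends the main program when it receives the ETX character that marks the end of the transmission. The problem is that done is not volatile, so main may not re-read done from memory on every pass through while (!done). Even after serial_isr() sets the done flag, main may keep looping without noticing. Declaring done as volatile forces the value to be re-read from memory each time, which fixes the problem.

As with interrupts, in a multithreaded program another thread can change the value of a global variable at any point during execution. The compiler, however, does not know about other threads when it generates code, so if the current thread has not changed the variable, the generated code may not re-read it from memory. Global variables shared by several threads therefore have to be declared volatile or protected explicitly with a lock.

When a register copy is not reused and memory is always accessed, visibility is said to be guaranteed. In a multithreaded program this means that what one thread writes to memory becomes visible to other threads. Note that in standard C and C++, volatile provides neither atomicity nor ordering between threads, so shared data that is updated concurrently still needs a lock or atomic operations.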



References

http://tutorials.jenkov.com/java-concurrency/volatile.html  (Java volatile)

http://skyul.tistory.com/337  (C/C++ volatile)



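Creating thread pools with the Executors utility class


ExecutorService = Executors.newFixedThreadPool(int nThreads)

Creates a thread pool that holds at most the given number of threads. The object actually created is a ThreadPoolExecutor.

The pool always keeps the same number of threads. Idle threads are kept rather than removed.

If a thread terminates abnormally while running a task, however, a replacement thread is created, and the count may end up one higher than nThreads.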


ScheduledExecutorService = Executors.newScheduledThreadPool(int corePoolSize)

Creates a thread pool that supports scheduling and keeps the given number of threads. The object actually created is a ScheduledThreadPoolExecutor.



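ExecutorService = Executors.newSingleThreadExecutor()

Creates an ExecutorService that uses a single thread.

Only one thread ever runs. While it is busy, all other tasks wait in the queue and are executed one at a time in order. If the thread terminates abnormally, a new thread is created and the remaining tasks continue.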


ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor()

Creates a ScheduledExecutorService that uses a single thread. Tasks can be run after a delay or periodically, and the number of threads is fixed. Its functionality is similar to that of the Timer class.


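ExecutorService = Executors.newCachedThreadPool()

Creates a thread pool that creates threads as they are needed. Threads that already exist are reused.

The object actually created is a ThreadPoolExecutor.
There is no limit on the number of threads, so the count keeps growing for as long as more are needed.

Threads that stay idle for a certain time (60 seconds) are terminated.

Because unneeded threads are removed, the pool uses little server memory, but since threads are repeatedly created and destroyed it is inefficient when the workload is irregular.



(See http://javacan.tistory.com/124 for a detailed explanation.)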
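
A short sketch of two of these factory methods in use follows. The task (squaring numbers), the pool sizes and the class and method names are illustrative assumptions; only the Executors calls themselves come from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorsDemo {

    // Submits n squaring tasks to a fixed-size pool and collects the results in submission order.
    public static List<Integer> squares(int n, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int x = i;
                futures.add(pool.submit(() -> x * x));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());   // waits for that task to finish
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();            // without this the pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(squares(5, 2));

        // Single-threaded scheduler: run one task after a delay.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.schedule(() -> System.out.println("ran after 100 ms"), 100, TimeUnit.MILLISECONDS);
        scheduler.shutdown();           // by default, already-scheduled delayed tasks still run
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

Whichever factory is used, the pool should be shut down once it is no longer needed.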





CopyOnWriteArrayList 

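As the name copy-on-write suggests, reads need no synchronization and are left alone, while every modification works on a fresh copy of the underlying array. This makes it a good fit for places where reads greatly outnumber writes.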
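
One visible consequence of this strategy is that an iterator works on a snapshot, so a list can be modified while it is being iterated. The sketch below illustrates that; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteDemo {

    // Iterates over the list while adding to it, and returns what the iterator saw.
    public static List<String> iterateWhileAdding(List<String> list) {
        List<String> seen = new ArrayList<>();
        for (String s : list) {
            seen.add(s);
            list.add(s + "-copy");   // mutation during iteration
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> cow = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        System.out.println("seen:  " + iterateWhileAdding(cow));
        System.out.println("after: " + cow);
    }
}
```

Passing a plain ArrayList to the same method throws ConcurrentModificationException, because its iterator detects the structural change.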


BlockingQueue 

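This queue is most often used in the producer-consumer pattern. In fact, it is probably the most representative collection for multithreaded environments. Most of the queues mentioned earlier in the Actor / Akka notes are built on it.

Access to the queue is coordinated: a consumer that calls take() waits while the queue is empty, and a producer that calls put() waits while a bounded queue is full.

If the two sides fought over the shelf without that coordination, the shelf would break.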
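
A minimal producer-consumer sketch using ArrayBlockingQueue follows. The end-of-stream marker (POISON) and the class and method names are assumptions made for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerDemo {

    private static final int POISON = -1;   // hypothetical end-of-stream marker

    // Produces 1..n on a separate thread, consumes on the calling thread, returns the sum.
    public static int produceAndSum(int n, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    queue.put(i);        // blocks while the queue is full
                }
                queue.put(POISON);       // tell the consumer there is nothing more
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        try {
            while (true) {
                int item = queue.take(); // blocks while the queue is empty
                if (item == POISON) {
                    break;
                }
                sum += item;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(produceAndSum(100, 2));   // sum of 1..100
    }
}
```

A small capacity forces the producer to wait for the consumer, which is the back-pressure that makes a bounded queue useful.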


ConcurrentHashMap

ConcurrentHashMap locks only part of the map at a time, which makes it more efficient than Hashtable or a synchronized Map.
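
As an illustration, the sketch below counts words from several threads into one shared ConcurrentHashMap using merge(), which updates a single key atomically. The class and method names and the word list are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WordCountDemo {

    // Every thread counts all the words, so each word ends up with (occurrences * threads).
    public static Map<String, Integer> countInParallel(String[] words, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (String w : words) {
                    counts.merge(w, 1, Integer::sum);   // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(new String[] { "a", "b", "a" }, 4));
    }
}
```

A separate get() followed by put() would not be safe here even on a ConcurrentHashMap; the atomicity comes from doing the update in a single merge() call.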




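Runnable  :  does not return a result - void run()


Callable  :  returns a result - V call()



Future    :  the return value of a Callable is not available the moment it is submitted; it becomes available at some point in the future.
             Future is the interface through which that value is received.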
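
The three interfaces can be seen side by side in the following sketch. The summing task and the class and method names are illustrative assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableFutureDemo {

    // Computes 1 + 2 + ... + n on a worker thread and returns the result.
    public static int sumTo(int n) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Runnable logTask = () -> System.out.println("Runnable: nothing to return");
            Callable<Integer> sumTask = () -> {
                int sum = 0;
                for (int i = 1; i <= n; i++) {
                    sum += i;
                }
                return sum;
            };

            executor.execute(logTask);                          // fire and forget
            Future<Integer> future = executor.submit(sumTask);  // handle to a later result
            return future.get();                                // blocks until the result is ready
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(10));
    }
}
```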



ThreadLocal  

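Local variables inside a function are private to each thread. But what if each thread needs its own variable that lives outside a single function, for example in a static class member, so that the same thread can use it from other places? That is what ThreadLocal is for.

ThreadLocal lets all code running on one thread use the same object, so it is mainly used to pass an object along within a thread without using parameters. Typical uses are:

- Propagating user authentication: Spring Security uses ThreadLocal to propagate the user's authentication information.
- Propagating transaction context: transaction managers use ThreadLocal to propagate the transaction context.
- Holding data that must be thread-safe

Caution when using ThreadLocal

In a thread pool environment, data stored in a ThreadLocal variable must be removed once it is no longer needed. Otherwise a reused thread may see data left over from an earlier task.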
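
The leak described in this caution can be reproduced with a single pooled thread, as in the sketch below. CURRENT_USER and the class and method names are hypothetical and are not taken from Spring Security or any other library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalDemo {

    // Hypothetical per-thread holder for the "current user".
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two tasks on ONE pooled thread and returns what the second task sees.
    public static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... work that reads CURRENT_USER.get() ...
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove();   // clear before the thread is reused
                    }
                }
            };
            Callable<String> second = () -> CURRENT_USER.get();

            pool.submit(first).get();            // wait until the first task is done
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("with remove():    " + secondTaskSees(true));
        System.out.println("without remove(): " + secondTaskSees(false));
    }
}
```

Putting remove() in a finally block ensures the value is cleared even when the task fails.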


CountDownLatch 

Used where something has to run once all threads (tasks) have finished. It is a way to detect events from a group of related threads.
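
A minimal sketch of waiting for a group of workers follows. The worker body is a stand-in for real work, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class CountDownLatchDemo {

    // Starts the given number of workers and returns once every one of them has finished.
    public static int runWorkers(int workers) {
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger finished = new AtomicInteger();

        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    finished.incrementAndGet();   // stand-in for real work
                } finally {
                    done.countDown();             // signal completion, even on failure
                }
            }).start();
        }

        try {
            done.await();                         // blocks until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("workers finished: " + runWorkers(5));
    }
}
```

A latch cannot be reset, so it suits one-shot events such as "all startup tasks are done".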


CyclicBarrier

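In a sense the opposite of CountDownLatch: it is used where something has to run once all threads have blocked at the barrier, rather than once they have finished.

It can be seen as a way for threads to wait for one another.

HAMA, an open-source project used for distributed machine learning, uses the BSP (Bulk Synchronous Parallel) model. In BSP,

each computer does its work and then stops when it is done. Once all computers have stopped (each has finished its own work),

they communicate with one another. This is similar to putting a distributed lock across all of the computers.

The distributed lock is provided by an open-source project called ZooKeeper.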
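
The BSP-style "work, then wait for everyone" cycle can be sketched within a single JVM using CyclicBarrier. This is only an analogy to the distributed case; the class and method names and the log messages are hypothetical.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

public class CyclicBarrierDemo {

    // Runs `workers` threads through `steps` rounds; the barrier action logs each completed round.
    public static List<String> run(int workers, int steps) {
        List<String> log = new CopyOnWriteArrayList<>();
        int[] round = { 0 };   // only touched by the barrier action, one thread at a time
        CyclicBarrier barrier = new CyclicBarrier(workers,
                () -> log.add("step " + (++round[0]) + " done"));

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                try {
                    for (int s = 0; s < steps; s++) {
                        // ... local computation for this round ...
                        barrier.await();   // wait here until every worker has arrived
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 2));
    }
}
```

Unlike a CountDownLatch, the barrier resets itself after each round, which is why it fits repeated phases.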


Exchanger 

Lets two different threads hand data to each other (even a whole collection at once). It does not seem to be needed very often.

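import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Exchanger;

public class ExchangerTest {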
 
    private static final int FULL = 5;
    private static final int COUNT = FULL * 2;
    private static final Random random = new Random();
   
    private static volatile int sum = 0;
    
    private static Exchanger<List<Integer>> exchanger =
        new Exchanger<List<Integer>>();
   
    private static CountDownLatch stopLatch =
        new CountDownLatch(2);
 
    private static List<Integer> initiallyEmptyBuffer;
    private static List<Integer> initiallyFillBuffer;
   
    private static class FillingLoop implements Runnable {
        public void run() {
            List<Integer> currentBuffer = initiallyFillBuffer;
            try {
                for (int i = 0; i < COUNT; i++) {
                    if (currentBuffer == null)
                        break;
                   
                    Integer item = random.nextInt(100);
                    System.out.println("Item Added: " + item);
                    currentBuffer.add(item);
                   
                    if (currentBuffer.size() == FULL) {
                        currentBuffer = exchanger.exchange(currentBuffer);
                    }
                }
            } catch (InterruptedException ex) {
                System.out.println("Bad exchange on filling side");
            }
            stopLatch.countDown();
        }
    }  
 
    private static class EmptyingLoop implements Runnable {
        public void run() {
            List<Integer> currentBuffer = initiallyEmptyBuffer;
            try {
                for (int i = 0; i < COUNT; i++) {
                    if (currentBuffer == null)
                        break;
 
                    if (currentBuffer.isEmpty()) {
                        currentBuffer = exchanger.exchange(currentBuffer);
                    }
                    
                    Integer item = currentBuffer.remove(0);
                    System.out.println("Item Got: " + item);
                    sum += item.intValue();
                   
                }
            } catch (InterruptedException ex) {
                System.out.println("Bad exchange on emptying side");
            }
            stopLatch.countDown();
        }
    }  
 
    public static void main(String args[]) {
        initiallyEmptyBuffer = new ArrayList<Integer>();
        initiallyFillBuffer = new ArrayList<Integer>();
   
        new Thread(new FillingLoop()).start();
        new Thread(new EmptyingLoop()).start();
   
        try {
            stopLatch.await();
        } catch (InterruptedException ex) {
            ex.printStackTrace();
        }
        System.out.println("Sum of all items is.... " + sum);
    }
}


FutureTask

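Wraps a Callable (or Runnable) as a task that can be handed to an executor and whose result is read later with get(). get() blocks until the task has finished.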

import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

public class Preloader {

    private final FutureTask<List<String>> task1
            = new FutureTask<List<String>>(new MyCallable());
    private final FutureTask<List<String>> task2
            = new FutureTask<List<String>>(new MyCallable());
    private final FutureTask<List<String>> task3
            = new FutureTask<List<String>>(new MyCallable());
    private final FutureTask<List<String>> task4
            = new FutureTask<List<String>>(new MyCallable());
    private final FutureTask<List<String>> task5
            = new FutureTask<List<String>>(new MyCallable());

    ExecutorService es = Executors.newFixedThreadPool(5);

    public void testGO() throws Exception {
        es.submit(task1);
        Thread.sleep(300);
        es.submit(task2);
        Thread.sleep(300);
        es.submit(task3);
        Thread.sleep(300);
        es.submit(task4);
        Thread.sleep(300);
        es.submit(task5);

        // get() blocks until each task is done, so the results print in this order
        PrintTest(task2);
        PrintTest(task5);
        PrintTest(task4);
        PrintTest(task3);
        PrintTest(task1);
        Thread.sleep(20000);
        es.shutdown();
    } // End Of testGo

    private void PrintTest(FutureTask<List<String>> tempTask) {
        List<String> k = null;
        try {
            k = tempTask.get();
        } catch (InterruptedException e) {
            System.out.println("Exception : " + e.getMessage());
        } catch (ExecutionException e) {
            System.out.println("Exception : " + e.getMessage());
        } // End Of try
        if (k == null) {
            return; // get() failed, so there is nothing to print
        }
        for (String l : k) {
            System.out.println(l);
        }
    } // End Of printTest

    public static void main(String[] args) {
        Preloader preloader = new Preloader();
        try {
            preloader.testGO();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        } // End Of try
    } // End Of main

} // End Of Class



import java.util.List;
import java.util.Vector;
import java.util.concurrent.Callable;

public class MyCallable implements Callable<List<String>> {

    @Override
    public List<String> call() throws Exception {
        List<String> ret = new Vector<String>();
        for (int i = 0; i < 10; i++) {
            String temp = "Hello, this is " + Thread.currentThread() + ".";
            Thread.sleep(300);
            ret.add(temp);
        }
        return ret;
    }
}


Semaphore

import java.util.Random;
import java.util.concurrent.Semaphore;

public class SemaphoreTest {

      private static final Random rd = new Random(10000);

      static class Log {
             public static void debug(String strMessage) {
             System.out.println(Thread.currentThread().getName()  + " : " + strMessage);
            }
        }

      class SemaphoreResource extends Semaphore {

            private static final long serialVersionUID = 1L;

            public SemaphoreResource(final int permits) {
                   super(permits);
            }

           public void use() throws InterruptedException {

                  acquire(); // acquire a semaphore permit

                 try {
                        doUse();
                 } finally {
                       release(); // release the semaphore permit
                       Log.debug("permits left after thread finished: " +   this.availablePermits());
                }
             }

           protected void doUse() throws InterruptedException {

                // simulated time that some piece of work takes to run
               int sleepTime = rd.nextInt(500);
               Thread.sleep(sleepTime); // simulate the run time
               Log.debug(" Thread running..................." + sleepTime);

                              /** something logic **/

              }

        }

     class MyThread extends Thread {

            private final SemaphoreResource resource;

            public MyThread(String threadName, SemaphoreResource resource) {
                   this.resource = resource;
                   this.setName(threadName);
            }

            @Override
             public void run() {
                  try {
                      resource.use();
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt(); // restore the interrupt flag instead of swallowing it
                  }
             }

        }

     public static void main(String... s) {

          System.out.println("Test Start...");
          SemaphoreResource resource =  new SemaphoreTest().new SemaphoreResource(4);

         for (int i = 0; i < 20; i++) {
                new SemaphoreTest().new MyThread("Thread-" + i, resource) .start();
          }

      }

}





Need to track hundreds of billions of data points? These Opower engineers’ Open Source software can help.

Last month, our colleague Greg Poirier wrote about Opower’s innovations with Open Source software, and specifically Wizardvan — an Opower open-source contribution that helps organize computational metrics on a large scale.

Today we’ll discuss a similar innovation that Opower developed and everyone can now use thanks to the nature of Open Source contributions. As our colleagues have previously mentioned, Open Source software is so valuable because the larger the community there is using it and working on it, the better it becomes.

THE CHALLENGE AND OPPORTUNITY OF LARGE DATA VOLUMES

Because Opower’s data volumes are scaling rapidly as we partner with more utilities and launch more programs, we have to take special care that our production systems and services always operate at optimal levels of performance. We measure that performance around the clock by tracking metrics on the type and amount of data flowing through our system.

Historically, we’ve utilized two traditional open-source systems to provide a backend to store our metrics data: Graphite and OpenTSDB. Both systems claim to be scalable solutions for storing vast amounts of information about data metrics, but have different approaches to addressing the scalability challenge.


Schematics of two OSS approaches that store metrics data: Graphite and OpenTSDB

As the amount of metrics data in our systems has continued to grow, we’ve begun to approach the limits of what existing versions of Graphite can support. For example, Graphite runs on “whisper,” a fixed-size database where each individual node can only store so much data. In addition, whisper’s archiving and time-stamping features aren’t nimble or efficient enough to support ultra-high volumes of data.

Where Graphite and its fixed-size database structure fall short, OpenTSDB can step in. OpenTSDB’s advantages span a range of features, including linear scaling and time-efficient scans.

However, as we’ve started to rely more on OpenTSDB for storing our metrics, we’ve found the need to add more functionality to support specific scenarios that stem from processing large and ever-growing streams of energy-related data. So, we did what’s become the natural thing: we implemented new functionality ourselves and shared our improvements back to the Open Source software community.

STRENGTHENING OPEN SOURCE SOFTWARE, STRENGTHENING OPOWER’S DATA SYSTEMS

Here are two important Open Source contributions we recently developed that show how our day-to-day experience with high-volume data processing allows us to break new ground in Open Source software.

a) Metasync thread dead-locking

A benefit of having billions of rows of data in our system is that it can help us expose edge cases in Open Source software — especially edge cases that may not always be evident to the original maintainers of the Open Source repositories.

For example, in the case of OpenTSDB 2.0, we ran into a strange issue with running an operation called “metasync.” It would start properly and process data for about 5 minutes, then lock up suddenly and stop processing data. After some debugging work and looking at the code, we found the code block responsible:


A buggy block of code in a previous version of OpenTSDB software, which Opower engineers’ Open Source software contributions have helped improve

After 5 minutes, the OpenTSDB procedure would make a deferred call to reload some information (related to the new tree functionality, as shown in the code above). Unfortunately, this code block exited without releasing a mutual exclusion lock, which is what lets concurrent threads take turns safely on shared state. Every later attempt to acquire the lock then waited forever, so things would deadlock and no more data would be processed.
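
The general shape of this kind of bug, and the usual guard against it, can be sketched as follows. This is not the OpenTSDB code, which the original article showed only as an image; the class, the reload() method and its parameter are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockReleaseDemo {

    private static final ReentrantLock lock = new ReentrantLock();

    // Hypothetical reload step that sometimes exits early.
    public static boolean reload(boolean nothingToDo) {
        lock.lock();
        try {
            if (nothingToDo) {
                return false;   // early exit: the finally block still releases the lock
            }
            // ... reload shared state here ...
            return true;
        } finally {
            lock.unlock();      // runs on every path out of the try block
        }
    }

    public static boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("reloaded: " + reload(true) + ", still locked: " + isLocked());
        System.out.println("reloaded: " + reload(false) + ", still locked: " + isLocked());
    }
}
```

If unlock() were called only at the end of the method body, the early return would leave the lock held and every later caller would wait forever, which matches the symptom described above.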

In this case, we were able to identify and fix a serious bug because we had the amount of data required to run the “metasync” operation long enough to produce this edge-case. This may not always be possible for the maintainers of the source repository because their testing data-sets may be much smaller (or they may not have the time or resources to run extended tests). We submitted this bug fix to the OpenTSDB repository, and it’s included in the upcoming 2.0 release.

b) Support for open-ended queries

Here’s another case in which our large data volumes enabled us to identify and rectify a procedural limitation of OpenTSDB.

When running open-ended queries, it’s not always known beforehand how many data points any specific computational metric will produce. This situation is fine when a metric’s count of data points sits in the tens or hundreds of thousands. However, we have some very common metrics such as Input/Output performance and central processing utilization that are computed for all systems and have millions of data points.

Whenever we ran into an open-ended query of this kind, OpenTSDB would continue trying to retrieve data until eventually all of the worker threads would be occupied by these gigantic queries and the process would eventually stall.

Our solution was straightforward: add an option to OpenTSDB to abort querying for more data after a specified timeout was exceeded. This allowed us to bypass queries in a set amount of time (e.g., 60 seconds) instead of being stuck in an indefinite waiting pattern. An added benefit is that if people accidentally query for too much data, it will prevent them from crashing any given process in OpenTSDB. This feature is also beneficial to other OpenTSDB functionality because different portions of OpenTSDB share Input/Output worker capacity.
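
The idea of giving up on a query after a fixed time can be sketched in plain Java with Future.get and a timeout. This is an illustration of the technique only and is not how OpenTSDB implements its option; the class and method names and the "TIMED_OUT" marker are assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class QueryTimeoutDemo {

    // Runs a (hypothetical) query and abandons it once timeoutMillis has passed.
    public static String runWithTimeout(Callable<String> query, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(query);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);     // interrupt the worker so it is freed for other queries
            return "TIMED_OUT";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> "small result", 1000));
        System.out.println(runWithTimeout(() -> {
            Thread.sleep(5000);      // stands in for an oversized query
            return "too late";
        }, 100));
    }
}
```

A caller that receives the timeout marker can retry with a narrower time range or a coarser downsampling rate instead of tying up a worker indefinitely.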

We felt that this feature was broadly useful, so we contributed it back to the upstream open-source OpenTSDB repository for the 2.1 release. It’s an entirely optional configuration parameter, so regular users with less data won’t need to worry about it. But it’s available for users who need this sort of functionality to keep their OpenTSDB nodes functional during workloads that query large datasets. Additionally, users that do receive timeouts can retry the same query with a different downsampling rate.

FINAL THOUGHTS

By applying the above improvements to OpenTSDB, we’ve made our systems more stable and ensured we can continue to support an ever-growing amount of metrics data. In our day-to-day work of processing uniquely large utility data streams, we’ve been able to break new ground in using and refining powerful Open Source tools. By making improvements to OpenTSDB’s metasync operation, we’re helping large-scale data users around the world (including ourselves) reliably generate metadata about their metrics. And by building in new support for open-ended queries, we’ve provided the ability to time out and re-optimize queries for stall-prone scenarios where waiting forever is not an option.

http://www.washingtonpost.com/news/energy-environment/wp/2015/03/03/why-knowing-your-energy-personality-could-help-save-you-a-lot-of-money/  (translation)


Knowing your “energy personality” can save you a lot of money


   

For much of humanity today, getting out of bed is followed, very closely, by turning on a bunch of stuff. We crank up the heat. We start the coffee. We click on the morning news.

Some of us then go to work — and turn the stuff off again before we leave. Some of us don’t (either work, or turn our stuff off). Some of us get home from work and crash — but some of us stay up late, with lots of lights on, watching television, listening to music, working on the computer.

More stuff — sucking up electricity.

These patterns comprise what you might call our diverse “energy personalities” — which are theoretically capable of being quantified in terms of how much power we use across the day. Heck, these personalities could ultimately be reduced all the way down to flows of electrons. But the data have never been easily accessible to create such profiles — or, at least, not until now.

Opower is an Arlington, Va.-based software firm that works with power companies to help them better connect with their customers and, potentially, change their customers’ behavior. This role gives the company access to a ton of data, including from smart meters, which record our energy use and convey it back to utility companies in intervals of 15 minutes or less. (Opower protects customer data and confidentiality; you can read its privacy principles here.)

Using this data, Opower’s Nancy Hersh, the company’s vice president of analytics, recently plotted the energy use of more than 800,000 homes over a 24-hour period. The resulting figure (not shown), in Hersh’s words, resembled a “hairball.” It was a blur of tangled lines, running horizontally, zigzagging from midnight to midnight.

But after Hersh and her team applied some “exploratory techniques” to the data, they saw that there were actually five separate clusters within the seemingly chaotic mess — representing five separate lifestyles that people tend to pursue. “The hairball can actually be untangled into these five separate ways that customers are living,” Hersh says.

Here’s the result — for a representative group of 1,000 customers, not the full 800,000:

There are several things to note about this image. One is that even when people are sleeping, they’re never using zero power. That’s because of all the objects in the home, like cable boxes and cable televisions and various Internet-connected devices, that never actually turn off.

[“Your home is full of devices that never turn off. And they’re costing you a lot of money“]

The second thing to notice is that there are five well-trod paths for using energy over the course of the day, or, if you’ll permit, five major personalities. “Every customer that we have smart meter data on can be classified into one of those five curves,” Hersh says.

Hersh and her team gave five names to the five curves and the types of people they represent: Twin Peaks, Evening Peakers, Daytimers, Night Owls and Steady Eddies.

Then, based on additional demographic data, they examined what kinds of people tend to be Daytimers, vs. Evening Peakers, vs. Steady Eddies.

“The Night Owl group skews young and apartment condo dwellers. And the Daytimers skew old with few children,” Hersh says. “And the thing I love about this, compare Night Owls to Daytimers, just the curves themselves, they’re mirror opposites.”

What of the other shapes? According to Hersh, Steady Eddies tend to live in condos and have a high level of winter energy usage from electric heating; Evening Peakers represent single-family homes that use a lot of power in the summer on air conditioning; and Twin Peaks tends to be wealthier families, also in single-family homes whose heat source is electric, not gas.

Data like these aren’t just a cool curiosity — they have big relevance for utility companies and for their customers. Why? Because depending on your energy personality, there are likely to be very different ways and strategies for you to save money and reduce your energy use.

“Let’s imagine there’s two energy-efficiency programs, one aimed at reducing consumption during peak periods and a different one that’s aimed at reducing base load,” Hersh says. “I want the one that’s useful to me. So when I have a big peak in the afternoon, having the program that’s aimed at reducing peak [usage] is going to be much more relevant to me. Whereas if I’m flat like a Steady Eddie, the big lever for me is not peak, it’s reducing my baseline. So it gives me more control over my energy use.”

So how can you figure out your energy personality, so that you can start saving? Currently, Opower doesn’t produce these charts for individuals, though Hersh says it is considering whether to do so.

In the meantime, if you have a smart meter, your utility probably provides an online portal where you can see your power usage, hour by hour. It probably won’t look like the curves above, but it will help you get a sense. By just knowing your own habits and patterns, you may also be able to guess which group you’re in and let that inform how you use power in your home.

“The ability to bring alive consumer smart meter data is a super exciting area for me,” Hersh says. “For the first time, utilities have just hoards and hoards and hoards of smart meter data, and it’s just sitting there. This changes around this paradigm.”


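For some years now, you have probably been hearing about machine learning. It is the field of science that sifts rapidly through huge datasets to find hidden patterns. Those patterns are letting companies solve problems that used to be hard to crack. Machine learning algorithms filter out spam and warn you ahead of time when someone may be using your credit card. In the future, they might save your life.

At their core, machine learning tools take in large amounts of complex information, learn from it, and apply what they have learned to predict future events or make better estimates about unknowns. As the guardians of huge datasets that challenge conventional analytics, utilities stand to benefit from machine learning in a big way. Here are seven fundamental business challenges it can help solve.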


1. 숨겨진 에너지사용 패턴을 밝혀라

유틸리티들은 고객경험을 개인화하기 위해 뛰고있고 . 소비자 세계는 기업들이 혜안을가지고 잘 껴맞추어서 제공하기를 바라고있다. 머신러닝은 그런것을 도울수있고 , Opower 에서는 그들의 개인적인 에너지 사용 행동양식에 의한 고객의 단편화 와 고객의 에너지 사용습관에서 숨겨진 트랜드를 밝혀내기위해 그것을 사용한다. 그리고나서 소비자들을 그들과 관련된 정보 와 함께 목표로 삼는다.

예를들어,  낮시간동안 집에 있지 않은 사람들은  프로그래밍된 자동온도조절장치에 의해 이득을 취할수있고, 그런 사람들은 수요대응(조절) 프로그램에 대해 열려있다.

감독되지 않은 머신러닝( Unsupervised machine learning ) 은 그것을 가능하게한다.우리는 수십만 고객들에 대해 하루의 각 시간당 평균 에너지사용량을 가지고 시작했다. 한번에 이 데이터를 도시하여 우리는 아래와 같은 판독하기힘든 머리카락이 뭉친듯한 그래프를 얻었다. 

hairball_600px

But applying clustering techniques to the data reveals distinct usage patterns, or “energy personalities”:

load curves_600px

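Utilities should be able to treat their customers differently according to each energy personality. That way, marketing works better, demand-side management results improve, and customers get a more deeply personalized energy experience at home and at work.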
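The article does not say which clustering technique was used, so the following is only a minimal sketch that assumes plain k-means over 24-hour load curves. The class name `LoadCurveClustering`, the choice to seed centroids from the first k curves, and the normalization step are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Toy k-means over 24-hour load curves (hypothetical helper, not Opower's code). */
final class LoadCurveClustering {

    /** Scales a curve so its values sum to 1, so clusters reflect shape rather than total usage. */
    static double[] normalize(double[] curve) {
        double total = Arrays.stream(curve).sum();
        double[] out = new double[curve.length];
        for (int h = 0; h < curve.length; h++) {
            out[h] = total == 0 ? 0 : curve[h] / total;
        }
        return out;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the centroid closest to the curve. */
    static int nearest(double[] curve, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(curve, centroids[c]) < squaredDistance(curve, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs Lloyd's algorithm for a fixed number of iterations.
     * Assumption: the first k curves seed the centroids, so curves.length must be at least k.
     */
    static int[] cluster(double[][] curves, int k, int iterations) {
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = curves[c].clone();
        }
        int[] labels = new int[curves.length];
        for (int it = 0; it < iterations; it++) {
            // Assignment step: attach every curve to its nearest centroid.
            for (int i = 0; i < curves.length; i++) {
                labels[i] = nearest(curves[i], centroids);
            }
            // Update step: move each centroid to the mean of its members.
            for (int c = 0; c < k; c++) {
                double[] sum = new double[curves[0].length];
                int count = 0;
                for (int i = 0; i < curves.length; i++) {
                    if (labels[i] != c) continue;
                    for (int h = 0; h < sum.length; h++) sum[h] += curves[i][h];
                    count++;
                }
                if (count == 0) continue; // keep the old centroid for an empty cluster
                for (int h = 0; h < sum.length; h++) sum[h] /= count;
                centroids[c] = sum;
            }
        }
        return labels;
    }
}
```

Each resulting label would correspond to one candidate "energy personality". A real pipeline would need a better seeding strategy and some way to choose k, neither of which the article describes.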


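2. Get more people to participate in utility programs

Participation rates in energy efficiency and demand response programs are notoriously low. Traditionally, utilities have tried to raise them by breaking customers into a handful of hand-picked demographic segments and then marketing a tailored program to each segment. For example, high-income customers are likely to take advantage of rebate programs for expensive home appliances. Supervised machine learning offers a better approach. If you know which customers participated in which programs in the past, you can train a model to predict more accurately who will participate in the future, and automatically optimize program targeting.

Early research suggests that by combining dynamic information (energy use, previous interactions with the utility, and so on) with static information (income level, home size, and so on), machine learning models can raise program participation rates by as much as 20%.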


3. Find untapped energy efficiency opportunities

Behavioral messaging can make a powerful impact on energy efficiency. But if you really want to motivate people to save energy, you need to tie it to personalized insights and analysis. Breaking down a home’s energy consumption by appliance — air conditioning, water heating, and so on — can help customers understand their energy behavior and pinpoint their biggest savings opportunities.

That’s why we wrote a usage disaggregation algorithm, which takes in meter reads, weather data, household characteristics, and other variables to create a personalized profile of a customer’s home energy use — and corresponding efficiency advice. The end result looks like the chart below, which can be embedded everywhere from a customer web portal to the utility bill.

pie-chart_sized1

Best of all, the algorithm even works for customers who don’t have smart meters.
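The article does not spell out how the disaggregation algorithm works. As a rough illustration of how meter reads and weather data could be combined, here is a sketch that assumes a much simpler technique: an ordinary least-squares fit of monthly usage against cooling degree days. The class `DegreeDayDisaggregation` and both method names are hypothetical, and the real algorithm uses more inputs (household characteristics and others) than this sketch does.

```java
/** Least-squares fit of monthly usage against cooling degree days (illustrative only). */
final class DegreeDayDisaggregation {

    /** Returns {baseLoadKwh, kwhPerDegreeDay} for the line usage = base + slope * cdd. */
    static double[] fit(double[] coolingDegreeDays, double[] monthlyKwh) {
        int n = monthlyKwh.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += coolingDegreeDays[i] / n;
            meanY += monthlyKwh[i] / n;
        }
        double covariance = 0;
        double variance = 0;
        for (int i = 0; i < n; i++) {
            double dx = coolingDegreeDays[i] - meanX;
            covariance += dx * (monthlyKwh[i] - meanY);
            variance += dx * dx;
        }
        // With no variation in the weather there is nothing to attribute to it.
        double slope = variance == 0 ? 0 : covariance / variance;
        return new double[] { meanY - slope * meanX, slope };
    }

    /** Estimated share of one month's usage that tracks the weather, clamped to [0, 1]. */
    static double coolingShare(double[] model, double cdd, double kwh) {
        if (kwh <= 0) return 0;
        double share = model[1] * cdd / kwh;
        return Math.max(0, Math.min(1, share));
    }
}
```

Because the fit only needs one usage total per month, it works on monthly meter reads, which is the property the article highlights. The weather-independent intercept would still need to be split among other end uses by some other means.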


4. Determine how customers heat their homes

Accurate, personalized energy insights boost customer engagement. The reverse may also be true: homes and businesses that receive the wrong advice could be more likely to disengage, and tune out their utilities.

One place that comes into play is home heating. Utilities don’t always know how customers heat their homes — and while natural gas efficiency tips are helpful for people with gas heaters, they might be a turn-off for those with electric ones.

Machine learning techniques can ensure that customers get the right advice every time. By analyzing load curves from homes with known heating types, an algorithm can consistently predict customers’ heating hardware based on usage data alone.
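To make the idea of learning from homes with known heating types concrete, here is a minimal sketch that assumes a nearest-centroid classifier. The article does not name the model it used, and the class `HeatingTypeClassifier` and the feature choice (for instance average daily kWh in winter and in summer) are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Nearest-centroid classifier over usage features (a toy stand-in for a real model). */
final class HeatingTypeClassifier {
    private final Map<String, double[]> centroids = new LinkedHashMap<>();

    /** Averages the feature vectors of homes whose heating type is already known. */
    void train(String heatingType, double[][] knownHomes) {
        double[] mean = new double[knownHomes[0].length];
        for (double[] home : knownHomes) {
            for (int f = 0; f < mean.length; f++) {
                mean[f] += home[f] / knownHomes.length;
            }
        }
        centroids.put(heatingType, mean);
    }

    /** Returns the label whose centroid is closest to the given home's features. */
    String predict(double[] home) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : centroids.entrySet()) {
            double[] centroid = entry.getValue();
            double distance = 0;
            for (int f = 0; f < home.length; f++) {
                double d = home[f] - centroid[f];
                distance += d * d;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

A home with heavy winter electricity use would land near the "electric" centroid, so it would receive electric heating tips rather than natural gas ones.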


5. Optimize thermostat setpoints

In the same way, utilities can also feed smart meter data and weather patterns into a machine learning algorithm to estimate how homes and businesses are setting their thermostats. The algorithm’s output might look something like this:

Screen Shot 2015-03-09 at 5.47.21 PM

Utilities can apply thermostat setpoint estimates, which don’t require any hardware in the home, toward a variety of ends.

They might segment homes with inefficient setpoints, and offer personalized savings advice — including how much money customers could save by choosing a more efficient setpoint. Alternatively, they could use setpoint estimates and weather data to deliver bill forecasts halfway through the billing cycle. Or they could target homes with inefficient afternoon setpoints for demand response programs, like this:

DR homes


6. Identify electric vehicles on the grid

Utilities have a huge incentive to know when electric vehicles — the largest home appliances ever — are plugged into the grid. In the immediate term, it’s a customer engagement opportunity: EV detection would allow utilities to suggest time-of-use rates and encourage overnight charging. Down the road, electric cars could also prove to be an important demand response resource.

Last year, we analyzed the EV owners’ signature energy usage patterns, which are represented below. In the months since, we’ve developed machine learning algorithms that help utilities detect when their customers plug a new EV into the grid.

electric vehicle load curve
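The detection algorithms themselves are not described in the article. As a crude illustration, the sketch below assumes that a new EV shows up as a sustained jump in overnight consumption and compares two adjacent windows of days. The class `EvDetector`, the window length, and the threshold are hypothetical.

```java
/** Flags a sustained jump in overnight load, a crude proxy for a newly plugged-in EV. */
final class EvDetector {

    /** Mean of the values in [from, to). */
    static double mean(double[] values, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) sum += values[i];
        return sum / (to - from);
    }

    /**
     * @param overnightKwh one overnight total per day, oldest first
     * @param window       number of days in each comparison window
     * @param minJumpKwh   hypothetical threshold for the increase between windows
     * @return the first day whose following window exceeds the preceding
     *         window by at least the threshold, or -1 if there is none
     */
    static int firstJump(double[] overnightKwh, int window, double minJumpKwh) {
        for (int day = window; day + window <= overnightKwh.length; day++) {
            double before = mean(overnightKwh, day - window, day);
            double after = mean(overnightKwh, day, day + window);
            if (after - before >= minJumpKwh) {
                return day;
            }
        }
        return -1;
    }
}
```

A utility could use the returned day as the trigger for suggesting a time-of-use rate. A production detector would also have to rule out other causes of higher overnight load, such as seasonal heating.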


7. Retain customers over the long term

In Europe and around the world, utilities in competitive markets are working hard to cut churn and win their customers’ loyalty. Helpful, personalized services like high bill alerts and insightful call centers are giving them the tools to succeed. Machine learning can help utilities take customer care a step further — helping them identify homes at greatest risk of switching providers, listen to their concerns, and offer solutions.

A variety of industries — banking, insurance, and telecommunications among them — are already using churn modeling to deliver better customer experiences. By feeding customer characteristics, behavior, and billing information into a machine learning tool, utilities can start doing the same.


This article first appeared in Intelligent Utility on March 23, 2015.

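Translation of http://tech.kinja.com/interview-with-norman-maurer-netty-vert-x-1119968136



This article is the first part of a new interview series focused on event-driven solutions running on the JVM.


Norman Maurer is the technical lead of the popular Netty library and a core committer on the Vert.x project.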

Interview with Norman Maurer (Netty/Vert.x)

Congratulations on the Netty 4 and Vert.x 2.0 releases.

Thank you.

Let's talk about Netty 4.

Sure.

You dug into changes to how things are modularized, as well as things like ChannelHandlers firing fewer events...

Yes. We had a few issues, such as heavy event creation and GC pressure, and we thought it was time to fix them.


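Are you aware of other projects based on Netty 4 besides Vert.x? Personally, I'd like to see more real-world examples.

There are a few projects trying to port to Netty 4, though most of them are just getting started.


The reason I ask is that Netty 4 is clearly a big step forward (new Buffer handling, ChannelFuture-s, easier resource management, leak detection etc.), but I think the migration is quite tricky.

OK, to be more specific: Vert.x was a bit of a challenge, which is not surprising given that I'm part of the team working on Vert.x. The next project to adopt Netty 4, as far as I know, will be HornetQ.

There are a few smaller projects too. Adoption is appealing but progressing somewhat slowly, probably because of the heavy API changes and the state of the documentation. That is one of our weak spots, and I intend to improve it with my book.

Overall, newcomers will find it easier to adapt than long-time users, who have to adjust to the new concepts. For example, Netty 3 users have to understand why buffers are now pooled and whose responsibility it is to release them.

These changes are a move in the right direction. According to our benchmarks, pooled buffers in Vert.x show a 10% speedup over un-pooled ones.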


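You mentioned earlier that you rethought resource handling. Whose responsibility is it to release buffer resources? Does the pool take care of everything?

No. Basically, for inbound traffic, Netty's transport takes a buffer out of the pool and passes it through the pipeline, where a ChannelInboundHandler receives it. The last handler in the pipeline needs to release it.


Can the new memory leak detector catch buffers that were not released properly?

Our encoders and decoders release buffers automatically. We provide SimpleChannelInboundHandler, which releases every buffer automatically. In contrast, ChannelInboundHandlerAdapter does not. In most cases you can use the former. The exceptions are when you handle ByteBuf directly or write the same buffer back to the channel (as in an echo server or a proxy). For outbound traffic (that is, writes), the buffer is released once it has been written to the channel. The memory leak detector will detect these cases and log a warning.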
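The ownership rule described above (the last handler that consumes a buffer must release it) can be sketched with a toy model. To keep the example free of dependencies it does not use Netty at all: `CountedBuffer`, `Handler`, and the three handler constants are hypothetical stand-ins. In Netty 4 itself the corresponding calls are `release()` on a `ByteBuf` and `ReferenceCountUtil.release(msg)`.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of reference-counted buffers; class names are hypothetical, not Netty's API. */
final class BufferOwnershipDemo {

    static final class CountedBuffer {
        private int refCount = 1; // a freshly allocated buffer has one owner

        void release() {
            if (refCount == 0) throw new IllegalStateException("already released");
            refCount--;
        }

        boolean isReleased() {
            return refCount == 0;
        }
    }

    /** A handler either consumes the buffer (and must release it) or passes it on. */
    interface Handler {
        /** @return true if the buffer should be passed to the next handler */
        boolean handle(CountedBuffer buffer);
    }

    /** An intermediate stage that passes the buffer along untouched. */
    static final Handler FORWARDING = buffer -> true;

    /** A last stage that releases what it consumes. */
    static final Handler RELEASING_TAIL = buffer -> {
        buffer.release();
        return false;
    };

    /** A last stage that forgets to release: this is the leak. */
    static final Handler LEAKY_TAIL = buffer -> false;

    /** Runs one inbound buffer through the pipeline and reports whether it leaked. */
    static boolean leaks(List<Handler> pipeline) {
        CountedBuffer buffer = new CountedBuffer(); // stands in for "taken out of the pool"
        for (Handler handler : pipeline) {
            if (!handler.handle(buffer)) break;
        }
        return !buffer.isReleased();
    }

    public static void main(String[] args) {
        System.out.println(leaks(Arrays.asList(FORWARDING, RELEASING_TAIL)));
        System.out.println(leaks(Arrays.asList(FORWARDING, LEAKY_TAIL)));
    }
}
```

In this model the second pipeline is the situation the leak detector is meant to flag: the buffer left the pool, reached the end of the pipeline, and nobody released it.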


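OK. Let's switch to a lighter topic. If I need to build an HTTP server, which is the better foundation, Netty 4 or Vert.x? Both seem like good options.

Well, Netty is more low-level. (Translator's note: personally, I wonder what it would be like to use Vert.x as the low-level layer; in other words, I would like to use Vert.x in almost every case...)


If SockJS hits Netty 4, should I use the Vert.x implementation?

It depends on what you are looking for. Vert.x offers a lot more than Netty. If you don't need the various add-ons and libraries, plain Netty may be the better fit. Vert.x provides a really wide range of APIs, and if you want to go polyglot, Vert.x is the way to go.


The module system seems to have evolved a lot in Vert.x 2.0. Could you tell us about the new architecture? Being able to publish modules to Maven repos and pull them in looks very appealing.

OK. We made large parts of Vert.x pluggable by using modules, which made it more lightweight. Before, the APIs for the various languages were tightly coupled. Now everything is a module. That makes things easier for end users, and it is also easy to build reusable components. The whole idea behind modules is similar to Node.js: make it easy for people to register their modules.


All of this sounds like a very welcome change.

I hope to see Scala and Clojure support soon. One member has also started on PHP.


Do the language modules wrap the end-user API, or do they hook in closer to the lower layers?

The language modules are just wrappers around the Java API.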


...and as for custom modules, are those polyglot, too?

For custom modules you can depend on whatever other module you want to... so yes. The runnable modules communicate over the EventBus (JSON) so you can use a module and not care about which language it was written in.

I found the polyglot story interesting but to be honest, I have some reservations. In my experience, maintaining mixed projects (mixed in terms of languages) can be painful, especially when the need to pass native data types arises.

I think it is a matter of taste. I would, for example, only choose one language and stick with it when working on a project but if you split your project into different modules, you can mix things easily. For example, one team could work on a db driver and write it in JavaScript as all of them are js devs, another team would work on the core backend but use the db driver via the EventBus (JSON) without even knowing that it was written in JavaScript. So mainly the polyglot thingy gives everyone freedom and still allows easy reuse.

Yeah - I certainly can see different JVM langs interoperating on a service level (lots of companies are building on the JVM in this fashion i.e. Twitter, Netflix etc.) my comment was more about working within a single project.

I agree - mixed languages in one "module/service" can cause problems in terms of maintenance. Personally I would choose one lang and stick with it per "service/module".

The new vert.x website characterizes the EventBus as "actor-like". Where do you think the difference is?

Well, Vert.x gives you everything in one platform. With an actor-based system you still need to plug in your own HTTP server, as Spray does. So I see Vert.x more as a complete platform that lets you write async network code, potentially using multiple languages.

I saw a few community projects popping up on the mailing list

Yep. We got a few now...

Can you name a few 3rd party extensions that you are particularly fond of?

The one I really find interesting, which is still in active development, is a module which provides async mysql/postgresql access without using JDBC at all. Db access is still a big problem in the async world.

Yeah, totally. That's a big pain point...

In fact, I just started to port the original code to Netty4 to make sure it is run on the same EventLoop as Vert.x.

The other interesting project I'm waiting for is an async DNS codec. We actually want to make use of this DNS solution in a vert.x module.

I also saw some activities around RxJava support. Are you planning any official module for that?

We are working on it ;)

While we are at modules, I remember seeing a web framework/middleware on top of Vert.x, too

yeah that's called Yoke.

Do you see Vert.x as an end user library or folks should see it as higher abstraction than Netty but lower than say Play?

Basically what we want to offer with Vert.x is an easy way to write network services which can scale and everything on top should go in modules. We hope to make it easy to implement extra features as modules and register them in our module registry.

I was quite surprised to see that vert.x embed got some extra support in 2.0. When we talked about Vert.x a while back, the future of the embedding feature was less clear...

haha... I think one of the really hot things in this area is the .js support

Do you guys plan to have anything special for Nashorn?

We have licensing issues with Nashorn [Nashorn is under GPL2] which is unfortunate.

But what's quite interesting in JavaScript land is a node.js module which allows you to write node.js apps on top of Vert.x. It's still work-in-progress but it will provide a relatively easy migration path. Vert.x really out-performs node.js in many ways.

My final questions are about the whole event-driven space which seems to be booming these days.

What do you think about Reactor?

Reactor uses Netty [for their TCPServer/Client implementation], so I'm in contact with them. I think this space is indeed really hot at the moment and we will surely see a lot of traction in the future. The problem is that there are still a lot of legacy libraries around which slow things down.

Jonas Bonér from Typesafe was pushing for a shared vocabulary. Any thoughts on this initiative?

Yeah the manifesto ... To be honest, I think it's nothing new.. In fact, many frameworks do exactly what is written there. For example, all the event related attributes he mentions were also implemented in Netty 3. Vert.x also fits into this paradigm quite well.

What other collaborations can you imagine between these various projects (Vert.x, RxJava, Akka, Netty, Finagle, Reactor etc. etc.)?

Finagle is built on top of Netty - so obviously there is collaboration there already. Same for Play [although Play may switch to akka-io at some point but we'll see]. As for Reactor, I helped them with their Netty-based TCP implementation.

Well, that was it. Thanks for your time!

Thank you.



This neat data algorithm unlocks the power of smart grid technology—without using smart meters

Smart meters continue to transform the global utility landscape, offering cutting-edge features for energy providers and consumers alike — from outage detection to real-time consumption feedback.

In the US alone, the number of installed smart meters has approximately quadrupled over the past 5 years. These meters are collectively generating more than 1 billion usage data points every day, enabling vital data insights for utilities and their customers.

The smart grid is growing fast in other regions, too. European Union member states have collectively invested around 4 billion USD across hundreds of smart grid projects over the past decade. The UK is targeting nationwide smart meter coverage by 2020. And Japan’s largest electric company is working to equip all 27 million of its customers with advanced meters.

As projects like these unfold, smart meters are becoming the industry norm. But some regions are farther ahead than others. The Edison Electric Institute predicts that by 2015, only around half of US states will have smart meter penetration rates higher than 50 percent.


Some states are farther ahead than others with smart meter installations. (Source: IEE Edison Institute, August 2013)

But the energy industry can’t afford to wait until everyone has a smart meter. The value of advanced data insights is just too high. Utility customers around the world — with or without smart meters — want personalized energy analysis today, and utilities have never been more able and interested in delivering it to them.

That’s why Opower’s analytics team developed an algorithm that helps unlock the power of the smart grid not just for customers who have a smart meter, but also for those who don’t.

In particular, the algorithm provides a personalized breakdown of a customer’s energy usage into distinct end-use categories — air conditioning, appliances, hot water heating, and so on. That gives them a better understanding of what’s driving their energy costs.

What’s especially cool is that this “usage disaggregation” feature also works reliably for customers with traditional energy meters. Customers whose meters are read once a month can get a level of insight that’s very similar to what they’d get if their meter logged data every 15 minutes.

Advanced data algorithms can reliably disaggregate a customer’s energy usage into end-use components, even if a customer lacks a smart meter.

So how exactly does the algorithm work, especially for customers whose old-fashioned meters provide relatively sporadic and unspecific energy usage measurements? As you can imagine, the calculations require some clever data analytics — involving a strategic combination of data on historical energy consumption, weather patterns, household characteristics, user input, and other key variables.

More importantly: how do we know the algorithm actually produces accurate results?

The answer hinges on the fact that a smart meter is also capable of being a dumb meter. That is, you can do razor-sharp disaggregation analytics on a smart meter (say, one that takes hourly electric reads), and compare those results to what you’d get if that same meter were actually a “dumb” meter. To simulate this, one can simply mash together a smart meter’s 720 hourly reads (over the course of a month) into one single monthly usage read — which is exactly what a dumb meter would give you.
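The simulation step is simple enough to sketch directly. The class `DumbMeterSimulation` and the `relativeGap` comparison are assumptions added for illustration; the article only states that the hourly reads are collapsed into a single monthly read and that the two sets of results are compared.

```java
/** Collapses smart meter reads into what a monthly-read meter would report. */
final class DumbMeterSimulation {

    /** Sums interval reads into one total, which is all a monthly-read meter provides. */
    static double toMonthlyRead(double[] hourlyKwh) {
        double total = 0;
        for (double kwh : hourlyKwh) {
            total += kwh;
        }
        return total;
    }

    /** Relative gap between two estimates of the same end use, for example cooling kWh. */
    static double relativeGap(double fromSmartReads, double fromMonthlyRead) {
        return Math.abs(fromSmartReads - fromMonthlyRead) / Math.abs(fromSmartReads);
    }
}
```

Running the same disaggregation once on the 720 hourly values and once on the single total, then checking that the gap stays small, is one way to express the validation idea in code.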

By following this approach and incorporating a sufficient amount of related data (e.g. historical statistics, weather, household characteristics, etc.), our algorithm is able to produce consistent estimates across the two meter scenarios. This outcome indicates that, at least for certain applications, you can take a dumb meter and turn it into a smarter one.

The larger lesson here has important data science implications for the smart grid and beyond: when existing hardware is lacking in intelligence, you can often compensate for it by applying software intelligence.

And software intelligence can open up many other doors. For example, by building an integrated software platform that can quickly turn back-end calculations like those above into personalized advice for utility customers, you can deliver data insights at a minute’s notice to millions of people, specifically at the moments that matter most.

Consider the personalized email alert that Opower sent last month to 85,000 utility customers in the Northeast immediately before a string of hot days. The communication allowed all customers — even though the vast majority of them did not have smart meters — to see a personalized and disaggregated view of their seasonal electricity consumption, in effect educating them on how savvy use of their thermostat could make a big impact in controlling energy costs during the imminent heat wave.

Bringing intelligence to traditional energy meters is just one of many Owesome data projects we’re excited to be working on.



Source: http://blog.opower.com/2014/07/data-algorithm-smart-grid-without-smart-meters/
